
                            dpkg technical manual


-----------------------------------------------------------------------------
                                      
                                  Tom Lees

                           <tom@lpsg.demon.co.uk>

                               Version 2.4.13

-----------------------------------------------------------------------------

Copyright © 1997 Tom Lees

License Notice

    APT and this document are free software; you can redistribute them and/or
    modify them under the terms of the GNU General Public License as
    published by the Free Software Foundation; either version 2 of the
    License, or (at your option) any later version.

    For more details, on Debian systems, see the file
    /usr/share/common-licenses/GPL for the full license.

Abstract

This document describes the minimum necessary workings for the APT dselect
replacement. It gives an overall specification of what its external interface
must look like for compatibility, and also gives details of some internal
quirks.

-----------------------------------------------------------------------------

Table of Contents

1. Quick summary of dpkg's external interface

    1.1. Control files
    1.2. The dpkg status area
    1.3. The dpkg library files
    1.4. The "dpkg" command-line utility

        1.4.1. "Documented" command-line interfaces
        1.4.2. Environment variables which dpkg responds to
        1.4.3. Assertions
        1.4.4. --predep-package

2. dpkg-deb and .deb file internals

    2.1. The .deb archive format
    2.2. The dpkg-deb command-line

        2.2.1. Internal checks used by dpkg-deb when building packages

3. dpkg internals

    3.1. Updates
    3.2. What happens when dpkg reads the database
    3.3. How dpkg compares version numbers

Chapter 1. Quick summary of dpkg's external interface

1.1. Control files

    The basic dpkg package control file supports the following major
    features:-

      * 6 types of dependencies:-

          + Pre-Depends, which must be satisfied before a package may be
            unpacked

          + Depends, which must be satisfied before a package may be
            configured

          + Recommends, to specify a package which if not installed may
            severely limit the usefulness of the package

          + Suggests, to specify a package which may increase the
            productivity of the package

          + Conflicts, to specify a package which must NOT be installed in
            order for the package to be configured

          + Breaks, to specify a package which is broken by the package and
            which should therefore not be configured while broken

        Each of these dependencies can specify a version and a dependency on
        that version, for example "<= 0.5-1", "= 2.7.2-1", etc. The
        comparators available are:-

          + "<<" - less than

          + "<=" - less than or equal to

          + ">>" - greater than

          + ">=" - greater than or equal to

          + "=" - equal to

      * The concept of "virtual packages", which many other packages may
        provide, using the Provides mechanism. An example of this is the
        "httpd" virtual package, which all web servers should provide.
        Virtual package names may be used in dependency headers. However,
        current policy is that virtual packages do not support version
        numbers, so dependencies on virtual packages with versions will
        always fail.

      * Several other control fields, such as Package, Version, Description,
        Section, Priority, etc., which are mainly for classification
        purposes. The package name must consist entirely of lowercase letters
        and digits, plus the characters '+', '-', and '.'. Fields can extend
        across multiple lines - on the second and subsequent lines, there is
        a space at the beginning instead of a field name and a ':'. Empty
        lines must consist of the text " .", which will be ignored, as will
        the initial space for other continuation lines. This feature is
        usually only used in the Description field.
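
    The continuation-line rules above can be sketched in Python. This is a
    minimal illustration and not dpkg's own parser: the function name
    parse_stanza is hypothetical, it handles a single stanza only, and it
    performs none of dpkg's field validation.

```python
def parse_stanza(text):
    """Parse one control stanza into a dict of field name -> value."""
    fields = {}
    current = None
    for line in text.splitlines():
        if line.startswith((" ", "\t")):
            # Continuation line: belongs to the most recent field.
            if current is None:
                raise ValueError("continuation line before any field")
            body = line[1:]  # the initial space is not part of the value
            # A line holding only " ." stands for an empty line.
            fields[current] += "\n" + ("" if body == "." else body)
        elif line.strip():
            name, sep, value = line.partition(":")
            if not sep:
                raise ValueError("missing ':' in %r" % line)
            current = name.strip()
            fields[current] = value.strip()
    return fields
```

    Given a Description field followed by " Prints a greeting.", " ." and
    " Second paragraph.", the value keeps the short description on its first
    line and the two paragraphs, separated by an empty line, after it.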

1.2. The dpkg status area

    The "dpkg status area" is the term used to refer to the directory where
    dpkg keeps its various status files (GNU would have you call it the dpkg
    shared state directory). On Debian systems this is always /var/lib/dpkg.
    However, the default directory name should not be hard-coded, but
    #define'd, so that alteration is possible (it is available via configure
    in dpkg 1.4.0.9 and above). Of course, in a library, code should be
    allowed to override the default directory, but the default should be part
    of the library (so that the user may change the dpkg admin dir simply by
    replacing the library).

    Dpkg keeps a variety of files in its status area. These are discussed
    later on in this document, but a quick summary of the files is here:-

      * available - this file contains a concatenation of control information
        from all the packages which dpkg knows about. This is updated using
        the dpkg commands "--update-avail <file>", "--merge-avail <file>",
        and "--clear-avail".

      * status - this file contains information on the following things for
        every package:-

          + Whether it is installed, not installed, unpacked, removed, failed
            configuration, or half-installed (deconfigured in favour of
            another package).

          + Whether it is selected as install, hold, remove, or purge.

          + If it is "ok" (no installation problems), or "not-ok".

          + It usually also contains the section and priority (so that
            dselect may classify packages not in available)

          + For packages which did not initially appear in the "available"
            file when they were installed, the other control information for
            them.

        The exact format for the "Status:" field is:

              Status: Want Flag Status

        Where Want may be one of unknown, install, hold, deinstall, purge.
        Flag may be one of ok, reinstreq. Status may be one of not-installed,
        config-files, half-installed, unpacked, half-configured and
        installed. The states are as follows:-

        not-installed

            No files are installed from the package, it has no config files
            left, it uninstalled cleanly if it ever was installed.

        unpacked

            The basic files have been unpacked (and are listed in
            /var/lib/dpkg/info/[package].list). There are config files
            present, but the postinst script has _NOT_ been run.

        half-configured

            The package was installed and unpacked, but the postinst script
            failed in some way.
   
        installed

            All files for the package are installed, and the configuration
            was also successful.

        half-installed

            An attempt was made to remove the package but there was a failure
            in the prerm script.

        config-files

            The package was "removed", not "purged". The config files are
            left, but nothing else.

        The two last items are only left in dpkg for compatibility - they are
        understood by it, but never written out in this form.

        Please see the dpkg source code, lib/parshelp.c, statusinfos, 
        eflaginfos and wantinfos for more details.

      * info - this directory contains files from the control archive of
        every package currently installed. They are installed with a prefix
        of "<packagename>.". In addition to this, it also contains a file
        called <package>.list for every package, which contains a list of
        files. Note also that the control file is not copied into here; it is
        instead found as part of status or available.

      * methods - this directory is reserved for "method"-specific files -
        each "method" has a subdirectory underneath this directory (or at
        least, it can have). In addition, there is another subdirectory
        "mnt", where misc. filesystems (floppies, CD-ROMs, etc.) are mounted.

      * alternatives - directory used by the "update-alternatives" program.
        It contains one file for each "alternatives" interface, which
        contains information about all the needed symlinked files for each
        alternative.

      * diversions - file used by the "dpkg-divert" program. Each diversion
        takes three lines. The first is the package name (or ":" for user
        diversion), the second the original filename, and the third the
        diverted filename.

      * updates - directory used internally by dpkg. This is discussed later,
        in Section 3.1, “Updates”.

      * parts - temporary directory used by dpkg-split
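
    As an illustration of the "Status:" field format described above, the
    following Python sketch splits a value into its three words and checks
    each against the lists given in this section. The function name
    parse_status is hypothetical; for the authoritative tables see the dpkg
    source as noted above.

```python
WANT = {"unknown", "install", "hold", "deinstall", "purge"}
FLAG = {"ok", "reinstreq"}
STATUS = {"not-installed", "config-files", "half-installed",
          "unpacked", "half-configured", "installed"}

def parse_status(value):
    """Split a Status: value into (want, flag, status), validating each word."""
    parts = value.split()
    if len(parts) != 3:
        raise ValueError("expected three words, got %r" % value)
    want, flag, status = parts
    if want not in WANT:
        raise ValueError("unknown want value %r" % want)
    if flag not in FLAG:
        raise ValueError("unknown flag value %r" % flag)
    if status not in STATUS:
        raise ValueError("unknown status value %r" % status)
    return want, flag, status
```

    For example, parse_status("install ok installed") yields the three words
    as a tuple, while a value with an unrecognised word raises ValueError.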

1.3. The dpkg library files

    These files are installed under /usr/lib/dpkg (usually), but
    /usr/local/lib/dpkg is also a possibility (as Debian policy dictates).
    Under this directory, there is a "methods" subdirectory. The methods
    subdirectory in turn contains any number of subdirectories, one for each
    general method processor (note that one set of method scripts can be, and
    is, used for more than one of the methods listed under dselect).

    The following files may be found in each of these subdirectories:-

      * names - One line per method, two-digit priority to appear on menu at
        beginning, followed by a space, the name, and then another space and
        the short description.

      * desc.<name> - Contains the long description displayed by dselect when
        the cursor is put over the <name> method.

      * setup - Script or program which sets up the initial values to be used
        by this method. Called with first argument as the status area
        directory (/var/lib/dpkg), second argument as the name of the method
        (as in the directory name), and the third argument as the option (as
        in the names file).

      * install - Script/program called when the "install" option of dselect
        is run with this method. Same arguments as for setup.

      * update - Script/program called when the "update" option of dselect is
        run. Same arguments as for setup/install.
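
    The layout of a line in the "names" file can be illustrated with a short
    Python sketch. The function name parse_names_line and the sample line in
    the note below are made up for illustration; only the field layout comes
    from the description above.

```python
def parse_names_line(line):
    """Split a names-file line into (priority, name, short description)."""
    # Two-digit priority, a space, the method name, a space, the description.
    priority, name, description = line.rstrip("\n").split(" ", 2)
    if len(priority) != 2 or not (priority.isascii() and priority.isdigit()):
        raise ValueError("priority must be two digits: %r" % priority)
    return int(priority), name, description
```

    A line such as "35 cdrom Install from a CD-ROM" would give priority 35,
    the name "cdrom" and the remainder as the short description.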

1.4. The "dpkg" command-line utility

1.4.1. "Documented" command-line interfaces

    As yet unwritten. You can refer to the other manuals for now. See
    dpkg(8).

1.4.2. Environment variables which dpkg responds to

      * SHELL - used to determine which shell to run.

      * CC - used as the C compiler to call to determine the target
        architecture. The default is "gcc".

      * PATH - dpkg checks that it can find at least the following files in
        the path when it wants to run package installation scripts, and gives
        an error if it cannot find all of them:-

          + ldconfig

          + start-stop-daemon

          + install-info

          + update-rc.d
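
    A check of this kind can be approximated in Python with shutil.which.
    This is only a sketch of the idea, not dpkg's code; the function name
    missing_programs is hypothetical, and the program list is the one given
    above.

```python
import shutil

REQUIRED = ("ldconfig", "start-stop-daemon", "install-info", "update-rc.d")

def missing_programs(path=None, required=REQUIRED):
    """Return the required programs that cannot be found on the search path.

    With path=None the PATH environment variable is used.
    """
    return [prog for prog in required if shutil.which(prog, path=path) is None]
```

    A caller would treat a non-empty result as an error before running any
    package installation script.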

1.4.3. Assertions

    The dpkg utility itself is required for quite a number of packages, even
    if they have been installed with a tool totally separate from dpkg. The
    reason for this is that some packages, in their pre-installation scripts,
    check that your version of dpkg supports certain features. This was
    broken from the start, and it should have actually been a control file
    header "Dpkg-requires", or similar. What happens is that the
    configuration scripts will abort or continue according to the exit code
    of a call to dpkg, which will stop them from being wrongly configured.

    These special command-line options, which simply return true or false,
    are all prefixed with "--assert-". Here is a list of them (without the
    prefix):-

      * support-predepends - Returns success or failure according to whether
        a version of dpkg which supports predepends properly (1.1.0 or above)
        is installed, according to the database.
   
      * working-epoch - Returns success or failure according to whether a
        version of dpkg which supports epochs in version numbers properly
        (1.4.0.7 or above) is installed, according to the database.

    Both these options check the status database to see what version of the
    "dpkg" package is installed, and check it against a known working
    version.
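
    From a script's point of view, an assertion is just an exit code. The
    Python sketch below shows how a caller might test one; the helper name
    dpkg_asserts and its "run" parameter (there so that the call can be
    replaced in tests) are assumptions made for this example.

```python
import subprocess

def dpkg_asserts(feature, run=subprocess.run):
    """Return True if 'dpkg --assert-<feature>' exits with status 0."""
    result = run(["dpkg", "--assert-" + feature])
    return result.returncode == 0
```

    A maintainer script equivalent would abort when, for instance,
    dpkg_asserts("working-epoch") is false.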

1.4.4. --predep-package

    This strange option is described as follows in the source code:

    /* Print a single package which:
     *  (a) is the target of one or more relevant predependencies.
     *  (b) has itself no unsatisfied pre-dependencies.
     * If such a package is present output is the Packages file entry,
     *  which can be massaged as appropriate.
     * Exit status:
     *  0 = a package printed, OK
     *  1 = no suitable package available
     *  2 = error
     */

    On further inspection of the source code, it appears that what it does is
    this:-

      * Looks at the packages in the database which are selected as
        "install", and are installed.

      * It then looks at the Pre-Depends information for each of these
        packages from the available file. When it finds a package for which
        any of the pre-dependencies are not satisfied, it breaks from the
        loop through the packages.

      * It then looks through the unsatisfied pre-dependencies, and looks for
        packages which would satisfy this pre-dependency, stopping on the
        first it finds. If it finds none, it bombs out with an error.

      * It then continues this for every dependency of the initial package.

    Eventually, it writes out the record of all the packages to satisfy the
    pre-dependencies. This is used by the disk method to make sure that its
    dependency ordering is correct. What happens is that all pre-depending
    packages are first installed, then it runs dpkg -iGROEB on the directory,
    which installs in the order package files are found. Since
    pre-dependencies mean that a package may not even be unpacked unless they
    are satisfied, it is necessary to do this (usually, since all the package
    files are unpacked in one phase, then configured in another, this is not
    needed).
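
    The selection rule quoted from the source comment can be sketched on toy
    data in Python. This is a deliberately simplified reading: it ignores
    versions, virtual packages and the available file, and the name
    next_predep_target and its three arguments are hypothetical.

```python
def next_predep_target(wanted, installed, predepends):
    """Return one package that should be installed first, or None.

    wanted      -- iterable of package names selected for installation
    installed   -- set of package names already installed
    predepends  -- dict mapping a package name to its Pre-Depends list
    """
    def unsatisfied(pkg):
        return [dep for dep in predepends.get(pkg, []) if dep not in installed]

    for pkg in wanted:
        pending = unsatisfied(pkg)
        seen = set()
        while pending:
            target = pending[0]  # stop on the first candidate found
            if target in seen:
                raise ValueError("pre-dependency loop at " + target)
            seen.add(target)
            deeper = unsatisfied(target)
            if not deeper:
                # (a) it is the target of a pre-dependency and
                # (b) it has no unsatisfied pre-dependencies itself.
                return target
            pending = deeper
    return None
```

    Calling this repeatedly, installing the returned package each time,
    yields an order in which every pre-dependency is in place before the
    package that needs it; None corresponds to exit status 1 above.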

Chapter 2. dpkg-deb and .deb file internals

    This chapter describes the internals of the "dpkg-deb" tool, which is
    used by "dpkg" as a back-end. dpkg-deb has its own tar extraction
    functions, which is the source of many problems, as it does not support
    long filenames stored using extension blocks.

2.1. The .deb archive format

    The main principle of the new-format Debian archive (I won't describe the
    old format - for that have a look at deb-old.5), is that the archive
    really is an archive - as used by "ar" and friends. However, dpkg-deb
    uses this format internally, rather than calling "ar". Inside this
    archive, there are usually the following members:-

      * debian-binary

      * control.tar.gz

      * data.tar.gz

    The debian-binary member consists simply of the string "2.0", indicating
    the format version. control.tar.gz contains the control files (and
    scripts), and the data.tar.gz contains the actual files to populate the
    filesystem with. Both tarfiles extract straight into the current
    directory. Information on the tar formats can be found in the GNU tar
    info page. Since dpkg-deb calls "tar -cf" to build packages, the Debian
    packages use the GNU extensions.
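
    Because the outer container is a plain ar archive, its members can be
    listed with a few lines of Python. The sketch below assumes the common ar
    layout (an 8-byte "!<arch>" magic line, then a 60-byte header before each
    member, with member data padded to an even length); the function name
    list_ar_members is hypothetical and this is not how dpkg-deb itself is
    written.

```python
def list_ar_members(data):
    """Return a list of (name, contents) pairs from an ar archive in memory."""
    magic = b"!<arch>\n"
    if not data.startswith(magic):
        raise ValueError("not an ar archive")
    members = []
    offset = len(magic)
    while offset + 60 <= len(data):
        header = data[offset:offset + 60]
        if header[58:60] != b"`\n":
            raise ValueError("bad member header at offset %d" % offset)
        # Name is the first 16 bytes, space padded; some tools add a '/'.
        name = header[0:16].decode("ascii").rstrip().rstrip("/")
        # Size is a decimal number in bytes 48..57 of the header.
        size = int(header[48:58].decode("ascii"))
        start = offset + 60
        members.append((name, data[start:start + size]))
        # Member data is padded to an even number of bytes.
        offset = start + size + (size % 2)
    return members
```

    Applied to the bytes of a new-format .deb file, this would be expected to
    report the three members listed above, in that order.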

2.2. The dpkg-deb command-line

    dpkg-deb documents itself thoroughly with its '--help' command-line
    option. However, I am including a reference to these for completeness.
    dpkg-deb supports the following options:-

      * --build (-b) <dir> - builds a .deb archive, takes a directory which
        contains all the files as an argument. Note that the directory
        <dir>/DEBIAN will be packed separately into the control archive.

      * --contents (-c) <debfile> - Lists the contents of the "data.tar.gz"
        member.

      * --control (-e) <debfile> - Extracts the control archive into a
        directory called DEBIAN. Alternatively, with another argument, it
        will extract it into a different directory.

      * --info (-I) <debfile> - Prints the contents of the "control" file in
        the control archive to stdout. Alternatively, giving it other
        arguments will cause it to print the contents of those files instead.

      * --field (-f) <debfile> <field> ... - Prints any number of fields from
        the "control" file. Giving it extra arguments limits the fields it
        prints to only those specified. With no command-line arguments other
        than a filename, it is equivalent to -I and just the .deb filename.

      * --extract (-x) <debfile> <dir> - Extracts the data archive of a
        debian package under the directory <dir>.
   
      * --vextract (-X) <debfile> <dir> - Same as --extract, except it is
        the equivalent of giving tar the '-v' option - it prints the
        filenames as it extracts them.

      * --fsys-tarfile <debfile> - This option outputs a gunzip'd version of
        data.tar.gz to stdout.

      * --new - sets the archive format to be used to the new Debian format

      * --old - sets the archive format to be used to the old Debian format

      * --debug - Tells dpkg-deb to produce debugging output

      * --nocheck - Tells dpkg-deb not to check the sanity of the control
        file

      * --help (-h) - Gives a help message

      * --version - Shows the version number

      * --licence/--license (UK/US spellings) - Shows a brief outline of the
        GPL

2.2.1. Internal checks used by dpkg-deb when building packages

    Here is a list of the internal checks used by dpkg-deb when building
    packages. It is in the order they are done.

      * First, the output Debian archive argument, if it is given, is checked
        using stat. If it is a directory, an internal flag is set. This check
        is only made if the archive name is specified explicitly on the
        command-line. If the argument was not given, the default is the
        directory name, with ".deb" appended.

      * Next, the control file is checked, unless the --nocheck flag was
        specified on the command-line. dpkg-deb will bomb out if the second
        argument to --build was a directory, and --nocheck was specified.
        Note that dpkg-deb will not be able to determine the name of the
        package in this case. In the control file, the following things are
        checked:-

          + The package name is checked to see if it contains any invalid
            characters (see Section 1.1, “Control files” for this).

          + The priority field is checked to see if it uses standard values,
            and user-defined values are warned against. However, note that
            this check is now redundant, since the control file no longer
            contains the priority - the changes file now does this.

          + The control file fields are then checked against the standard
            list of fields which appear in control files, and any
            "user-defined" fields are reported as warnings.

          + dpkg-deb then checks that the control file contains a valid
            version number.

      * After this, in the case where a directory was specified to build the
        .deb file in, the filename is created as "directory/pkg_ver.deb" or
        "directory/pkg_ver_arch.deb", depending on whether the control file
        contains an architecture field.

      * Next, dpkg-deb checks for the <dir>/DEBIAN directory. It complains if
        it doesn't exist, or if it has permissions < 0755, or > 0775.

      * It then checks that all the files in this subdir are either symlinks
        or plain files, and have permissions between 0555 and 0775.

      * The conffiles file is then checked to see if the filenames are too
        long. Warnings are produced for each that is. After this, it checks
        that the package provides initial copies of each of these conffiles,
        and that they are all plain files.
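
    Two of the checks above are simple enough to sketch in Python: the output
    filename chosen when a directory is given, and the permission test on
    <dir>/DEBIAN. The function names are hypothetical, the control file is
    assumed to be already parsed into a dict, and the permission bounds are
    read literally as a numeric range, which may not match dpkg-deb exactly.

```python
import os

def output_name(directory, control):
    """Build 'directory/pkg_ver.deb' or 'directory/pkg_ver_arch.deb'."""
    name = control["Package"] + "_" + control["Version"]
    if "Architecture" in control:
        name += "_" + control["Architecture"]
    return os.path.join(directory, name + ".deb")

def debian_dir_mode_ok(mode):
    """True if the permission bits lie between 0755 and 0775 inclusive."""
    return 0o755 <= (mode & 0o7777) <= 0o775
```

    The mode argument would typically come from os.stat(path).st_mode; the
    mask drops the file-type bits before the comparison.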

Chapter 3. dpkg internals

Table of Contents

3.1. Updates
3.2. What happens when dpkg reads the database
3.3. How dpkg compares version numbers

    This chapter describes the internals of dpkg itself. Although the
    low-level formats are quite simple, what dpkg does in certain cases often
    does not make sense.

3.1. Updates

    This describes the /var/lib/dpkg/updates directory. The function of this
    directory is somewhat strange, and seems only to be used internally. A
    function called cleanupdates is called whenever the database is scanned.
    This function in turn uses scandir(3), to sort the files in this
    directory. Files whose names do not consist entirely of digits are
    discarded. dpkg also causes a fatal error if any of the filenames are
    different lengths.
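
    The directory scan just described might look like this in Python. It is a
    sketch of the rules as stated here, not a port of cleanupdates; the
    function name pending_updates is hypothetical.

```python
import os

def pending_updates(updates_dir):
    """Return the update file names in the order they should be applied."""
    # Names that are not made up entirely of digits are ignored.
    names = [n for n in os.listdir(updates_dir)
             if n.isascii() and n.isdigit()]
    # dpkg treats names of differing lengths as a fatal error.
    if len({len(n) for n in names}) > 1:
        raise RuntimeError("update files have names of different lengths")
    # With equal-length digit strings, lexical order is numerical order.
    return sorted(names)
```

    An empty result means there is nothing to merge into the status file.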

    After having scanned the directory, dpkg in turn parses each file the
    same way it parses the status file (they are sorted by the scandir to be
    in numerical order). After having done this, it then writes the status
    information back to the "status" file, and removes all the "updates"
    files.

    These files are created internally by dpkg's "checkpoint" function, and
    are cleaned up when dpkg exits cleanly.

    Judging by the use of the updates directory I would call it a journal.
    In order to efficiently ensure the complete integrity of the status file,
    dpkg will "checkpoint" or journal all of its activities in the updates
    directory. By merging the contents of the updates directory (in order!!)
    against the original status file it can get the precise current state of
    the system, even in the event of a system failure while dpkg is running.

    The other option would be to sync-rewrite the status file after each
    operation, which would kill performance.

    It is very important that any program that uses the status file abort if
    the updates directory is not empty! The user should be informed to run
    dpkg manually (what options though??) to correct the situation.

3.2. What happens when dpkg reads the database

    First, the status file is read. This gives dpkg an initial idea of the
    packages that are there. Next, the updates files are read in, overriding
    the status file, and if necessary, the status file is re-written, and
    updates files are removed. Finally, the available file is read. It is
    read with flags which preclude dpkg from updating any status information
    (installed version, etc.) from it, and which tell dpkg to record that the
    packages it reads this time are available, not installed.

    More information on updates is given above.
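
    The read order can be modelled with plain dictionaries. In this Python
    sketch each file is assumed to be already parsed into a dict mapping
    package names to records; load_database and the "installed"/"available"
    keys are names invented for the example, not dpkg structures.

```python
def load_database(status, updates, available):
    """Combine status, updates (in order) and available into one view."""
    installed = dict(status)
    for update in updates:
        # Each update file overrides what was read before it.
        installed.update(update)
    packages = {}
    for name in installed.keys() | available.keys():
        packages[name] = {
            # Status information only ever comes from status and updates.
            "installed": installed.get(name),
            # The available file is recorded separately.
            "available": available.get(name),
        }
    return packages
```

    Keeping the two records apart mirrors the point made above: reading the
    available file never changes what is known about the installed package.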

3.3. How dpkg compares version numbers

    Version numbers consist of three parts: the epoch, the upstream version,
    and the Debian revision. Dpkg compares these parts in that order. If the
    epochs are different, it returns immediately, and so on.

    However, the important part is how it compares the versions which are
    essentially stored as just strings. These are compared in two distinct
    parts: those consisting of numerical characters (which are evaluated, and
    then compared), and those consisting of other characters. When comparing
    non-numerical parts, they are compared as the character values (ASCII),
    but non-alphabetical characters are considered "greater than"
    alphabetical ones. Also note that longer strings (after excluding
    differences where numerical values are equal) are considered "greater
    than" shorter ones.

    Here are a few examples of how these rules apply:-

    15 > 10
    0010 == 10
   
    d.r > dsr
    32.d.r == 0032.d.r
    d.rnr < d.rnrn
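
    The rules above can be written out as a short Python sketch. It follows
    the description in this section rather than the dpkg source, so treat it
    as an approximation; all function names are hypothetical, and the way the
    version is split (epoch before the first ':', Debian revision after the
    last '-') is the usual convention, assumed here.

```python
import re

def _weight(ch):
    """Letters sort by ASCII value; every other character sorts after them."""
    return ord(ch) if ch.isascii() and ch.isalpha() else ord(ch) + 256

def compare_strings(a, b):
    """Return a negative, zero or positive number as a is <, == or > b."""
    while a or b:
        # Leading non-digit runs, compared character by character.
        text_a = re.match(r"[^0-9]*", a).group()
        text_b = re.match(r"[^0-9]*", b).group()
        for i in range(max(len(text_a), len(text_b))):
            wa = _weight(text_a[i]) if i < len(text_a) else 0
            wb = _weight(text_b[i]) if i < len(text_b) else 0
            if wa != wb:
                return wa - wb
        a, b = a[len(text_a):], b[len(text_b):]
        # Following digit runs, compared by numeric value.
        num_a = re.match(r"[0-9]*", a).group()
        num_b = re.match(r"[0-9]*", b).group()
        if int(num_a or "0") != int(num_b or "0"):
            return int(num_a or "0") - int(num_b or "0")
        a, b = a[len(num_a):], b[len(num_b):]
    return 0

def split_version(version):
    """Split 'epoch:upstream-revision' into (epoch, upstream, revision)."""
    epoch, sep, rest = version.partition(":")
    if not sep:
        epoch, rest = "0", version
    upstream, sep, revision = rest.rpartition("-")
    if not sep:
        upstream, revision = rest, ""
    return int(epoch), upstream, revision

def compare_versions(a, b):
    """Compare epoch first, then upstream version, then Debian revision."""
    ea, ua, ra = split_version(a)
    eb, ub, rb = split_version(b)
    if ea != eb:
        return ea - eb
    return compare_strings(ua, ub) or compare_strings(ra, rb)
```

    compare_strings reproduces the five examples above, and compare_versions
    shows the epoch taking precedence: "1:0.9-1" compares greater than
    "2.0-1".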

Generated by dwww version 1.14 on Sun Feb 2 13:39:08 CET 2025.