INTRODUCTION This is the development area for OpenDKIM's statistics collection system. In earlier versions, the statistics system gathered numerous details about messages and their DKIM signatures in order to produce a report about the observed implementation of DKIM on the Internet. The data recorded included the number of signatures, percent of signature failures, first-party vs. third-party signatures, ADSP statistics, etc. These data were collected and submitted to the Internet Engineering Task Force as part of a requirement to advance DKIM to Draft Standard status, and this work was very successful. As of v2.5.0, the statistics system has been repurposed to support ongoing research and development into open DKIM-based reputation services. The data collected has been drastically reduced to include only information about message arrival times, From domains, signing domains, sending IP addresses, and whether or not the message was reported as spam. You are invited, but certainly not required, to provide this information to The Trusted Domain Project. The code that does the reporting is open source and included in this package, so you can easily verify that nothing other than the above is being revealed in the data thus submitted. Participants that submit data will be invited to be part of an experimental reputation system based on DKIM that should begin to appear in the near future. This directory contains information and software for generating, receiving and processing such reports, including commands for creating a MySQL database to store the reports for later query, and a program that can receive reports generated by opendkim-stats to put such data directly into that database. INSTALLATION These instructions are for setting up your own reputation database and a feed to it. For instructions about providing a feed to an external service only, skip to the EXTERNAL FEEDS section. This system can be made to work with any SQL-based system, but the provided scripts and documentation presume MySQL. 1. Compile OpenDKIM using the "--enable-stats" option: % ./configure --enable-stats % make If you wish to apply local extensions to the statistics reporting system, see the EXTENSIONS section below. 2. Install MySQL. Create credentials for an "opendkim" user, optionally with some password. Create a database called "opendkim". Note that this does NOT refer to a "real" UNIX user and password. The user you create must be granted SELECT and INSERT access to that table. 3. From inside the MySQL client, source the "mkdb.mysql" script. This will create the tables required to store the statistics reports. 4. Install the "opendkim-stats" program someplace. 5. Configure OpenDKIM to have statistics enabled, and begin reporting them to a file someplace. This involves the "Statistics" setting. See the opendkim(8) and opendkim.conf(5) man pages for details. 6. Restart opendkim. 7a. To get a human-readable form of the recorded statistics, use: opendkim-stats /path/to/stats/file ...using, of course, the path to the statistics file you configure in opendkim.conf. You can reset the contents of that file at any time by simply removing it or copying /dev/null on top of it. The filter will create/append to the file on the next received message. 7b. To translate records in the recorded statistics file into SQL insert operations, use: opendkim-importstats /path/to/stats/file ...again using the path you specified in opendkim.conf. You will probably also need one or more of the following: -d dbname database name (default "opendkim") -p dbpasswd database user's password (no default) -s scheme database scheme (default "mysql") -u dbuser database user (default "opendkim") Append "-r" to this to remove the statistics file on completion. You can also use the provided opendkim-genstats script to generate useful reports from the accumulated data. Contribution of other reports you find useful would be welcome. 8. To participate in The Trusted Domain Project's data collection work, ask for the submission address you should use and then execute the following command: opendkim-reportstats -register This will generate a GPG signing key pair and send the public key to The Trusted Domain Project so your signatures can be verified. Upon receipt of your key, we will add it to our key ring, which will cause our system to begin trusting your reports. We will notify you when this has happened. Once this is done, you can arrange to have your cron job execute this command: opendkim-reportstats -sendstats This will send a signed copy of the current statistics data to The Trusted Domain Project and add the ".old" suffix to the filename. The filter will automatically start a new file when the next message is received.. When your data copy is received, The OpenDKIM project will use the aforementioned "opendkim-importstats" tool to import your data into the central database. You can view regularly generated reports, including your data, at: http://www.opendkim.org/stats/report.html EXTERNAL FEEDS If you are interested in producing statistics for the purpose of exporting them to a data aggregator other than yourself, follow steps 1 and 5, 6 and 8 above only. FILE FORMAT The format of the file written by the opendkim filter is described here. If demand appears, a stable API for accessing it will be provided. Until then, application developers are advised not to rely on this information being stable between versions. A statistics file consists of lines of ASCII data which are delimited from each other by a single LF (ASCII 10). Empty lines or lines beginning with other than alphabetic characters are ignored completely. A line in the file that begins with a capital letter identifies the type of record it represents. The first such line also implicitly ends the global values section. There are currently these record types: M identifies a message S identifies a signature U updates a message A message record is a tab-separated, ordered sequence of fields, as follows: MTA-provided job/envelope ID (string) reporter (string; defaults to hostname) first domain found in the From: header field (string) SMTP client IP address (string) UNIX timestamp of message receive time message size, in bytes signature count ATPS status (-1 = not checked, 0 = no, 1 = yes) spam status (-1 = not checked, 0 = no, 1 = yes) A signature record implicitly references the preceding message record. There may be more than one signature record per message; there could also be none. As above, a signature record is a tab-separated, ordered sequence of fields, as follows: domain of the signature pass (0 = no, 1 = yes) failed due to "bh" mismatch (0 = no, 1 = yes) "l=" tag value (-1 = not present) error code from signature DNSSEC value (see DKIM_DNSSEC_* constants from dkim.h) Fields for which no value is known or appropriate should be represented as "-" in the file. UPDATES As of v2.5.0, a "spam" column is present in the table tracking per-message information. When a batch of new message information is sent, this field is typically populated with a spam value of 0 (not spam). An update record can be sent later that changes the value in this column. The fields in this case are: MTA-provided job/envelope ID (string) reporter (string; defaults to hostname) UNIX timestamp of message receive time (0 if not known) spam indicator (-1 = not tested, 0 = not spam, 1 = spam) The first three fields must match those given for the corresponding "M" (new message) record at some time in the past. Where the timestamp is 0, the latest record matching the first two is updated. The "spam" column for that message will be replaced with the value found in the fourth field. EXTENSIONS For the purpose of allowing local experimentation, it is possible to extend the statistics reporting to include data about a message not covered by the basic schema distributed with OpenDKIM. Extension data are collected and stored during execution of the Lua "final" script and then written out to the stats file associated with that message. These items are written on "X" lines in the form "name value" (i.e. name and value separated by whitespace). opendkim-importstats will insert these into your database using "name" as the name of the column to be updated and "value" as the value to be placed there. Choosing the "--enable-statsext" flag at build configuration time adds support for this. The mechanism for recording these extra statistics is the odkim.stats() function, detailed in the opendkim-lua(3) man page. In this way, one can create any supplementary message-specific columns as is desirable and populate them whatever way is appropriate for each. Participants that find particular additional columns produce interesting correlations are encouraged to share them with the OpenDKIM community. CONVERTING TO THE IPADDRS SCHEMA In version 2.3.0 of OpenDKIM, the schema changed to add a new "ipaddrs" table, moving that data from each row of the "messages" table (causing duplication). The opendkim-importstats tool in 2.3.0 expects the new schema and is not back-compatible with prior releases. To convert your existing databases, use these steps: 1) Run the stats/mkdb.mysql script from inside your MySQL client to create the required new table. Existing tables will not be altered. 2) Run the following additional MySQL commands (lines are wrapped here for readability but each represents a single command): a) LOCK TABLES messages WRITE, ipaddrs WRITE; b) ALTER TABLE messages ADD COLUMN ip INT UNSIGNED AFTER ipaddr; c) INSERT INTO ipaddrs (addr, firstseen) SELECT DISTINCT ipaddr, MIN(msgtime) FROM messages GROUP BY ipaddr; d) UPDATE messages SET ip = (SELECT id FROM ipaddrs WHERE addr = messages.ipaddr); e) ALTER TABLE messages MODIFY COLUMN ip INT UNSIGNED NOT NULL, DROP COLUMN ipaddr, ADD CONSTRAINT FOREIGN KEY(ip) REFERENCES ipaddrs(id) ON DELETE CASCADE; f) UNLOCK TABLES; The various ALTER TABLE commands may require a rebuild of the "messages" table. This can take a large amount of time if the table is large.
Generated by dwww version 1.14 on Thu Jan 23 03:26:25 CET 2025.