[ Contents ]
3. Data Element Formats
This section describes the data elements used by OpenPGP.
3.1. Scalar numbers
Scalar numbers are unsigned, and are always stored in big-endian
format. Using n[k] to refer to the kth octet being interpreted, the
value of a two-octet scalar is ((n[0] << 8) + n[1]). The value of a
four-octet scalar is ((n[0] << 24) + (n[1] << 16) + (n[2] << 8) +
n[3]).
3.2. Multi-Precision Integers
Multi-Precision Integers (also called MPIs) are unsigned integers
used to hold large integers such as the ones used in cryptographic
calculations.
An MPI consists of two pieces: a two-octet scalar that is the length
of the MPI in bits followed by a string of octets that contain the
actual integer.
These octets form a big-endian number; a big-endian number can be
made into an MPI by prefixing it with the appropriate length.
Examples:
(all numbers are in hexadecimal)
The string of octets [00 01 01] forms an MPI with the value 1. The
string [00 09 01 FF] forms an MPI with the value of 511.
Additional rules:
The size of an MPI is ((MPI.length + 7) / 8) + 2 octets.
The length field of an MPI describes the length starting from its
most significant non-zero bit. Thus, the MPI [00 02 01] is not formed
correctly. It should be [00 01 01].
3.3. Key IDs
A Key ID is an eight-octet scalar that identifies a key.
Implementations SHOULD NOT assume that Key IDs are unique. The
section, "Enhanced Key Formats" below describes how Key IDs are
formed.
3.4. Text
The default character set for text is the UTF-8 [RFC2279] encoding of
Unicode [ISO10646].
3.5. Time fields
A time field is an unsigned four-octet number containing the number
of seconds elapsed since midnight, 1 January 1970 UTC.
3.6. String-to-key (S2K) specifiers
String-to-key (S2K) specifiers are used to convert passphrase strings
into symmetric-key encryption/decryption keys. They are used in two
places, currently: to encrypt the secret part of private keys in the
private keyring, and to convert passphrases to encryption keys for
symmetrically encrypted messages.
3.6.1. String-to-key (S2k) specifier types
There are three types of S2K specifiers currently supported, as
follows:
3.6.1.1. Simple S2K
This directly hashes the string to produce the key data. See below
for how this hashing is done.
Octet 0: 0x00
Octet 1: hash algorithm
Simple S2K hashes the passphrase to produce the session key. The
manner in which this is done depends on the size of the session key
(which will depend on the cipher used) and the size of the hash
algorithm's output. If the hash size is greater than or equal to the
session key size, the high-order (leftmost) octets of the hash are
used as the key.
If the hash size is less than the key size, multiple instances of the
hash context are created -- enough to produce the required key data.
These instances are preloaded with 0, 1, 2, ... octets of zeros (that
is to say, the first instance has no preloading, the second gets
preloaded with 1 octet of zero, the third is preloaded with two
octets of zeros, and so forth).
As the data is hashed, it is given independently to each hash
context. Since the contexts have been initialized differently, they
will each produce different hash output. Once the passphrase is
hashed, the output data from the multiple hashes is concatenated,
first hash leftmost, to produce the key data, with any excess octets
on the right discarded.
3.6.1.2. Salted S2K
This includes a "salt" value in the S2K specifier -- some arbitrary
data -- that gets hashed along with the passphrase string, to help
prevent dictionary attacks.
Octet 0: 0x01
Octet 1: hash algorithm
Octets 2-9: 8-octet salt value
Salted S2K is exactly like Simple S2K, except that the input to the
hash function(s) consists of the 8 octets of salt from the S2K
specifier, followed by the passphrase.
3.6.1.3. Iterated and Salted S2K
This includes both a salt and an octet count. The salt is combined
with the passphrase and the resulting value is hashed repeatedly.
This further increases the amount of work an attacker must do to try
dictionary attacks.
Octet 0: 0x03
Octet 1: hash algorithm
Octets 2-9: 8-octet salt value
Octet 10: count, a one-octet, coded value
The count is coded into a one-octet number using the following
formula:
#define EXPBIAS 6
count = ((Int32)16 + (c & 15)) << ((c >> 4) + EXPBIAS);
The above formula is in C, where "Int32" is a type for a 32-bit
integer, and the variable "c" is the coded count, Octet 10.
Iterated-Salted S2K hashes the passphrase and salt data multiple
times. The total number of octets to be hashed is specified in the
encoded count in the S2K specifier. Note that the resulting count
value is an octet count of how many octets will be hashed, not an
iteration count.
Initially, one or more hash contexts are set up as with the other S2K
algorithms, depending on how many octets of key data are needed.
Then the salt, followed by the passphrase data is repeatedly hashed
until the number of octets specified by the octet count has been
hashed. The one exception is that if the octet count is less than
the size of the salt plus passphrase, the full salt plus passphrase
will be hashed even though that is greater than the octet count.
After the hashing is done the data is unloaded from the hash
context(s) as with the other S2K algorithms.
3.6.2. String-to-key usage
Implementations SHOULD use salted or iterated-and-salted S2K
specifiers, as simple S2K specifiers are more vulnerable to
dictionary attacks.
3.6.2.1. Secret key encryption
An S2K specifier can be stored in the secret keyring to specify how
to convert the passphrase to a key that unlocks the secret data.
Older versions of PGP just stored a cipher algorithm octet preceding
the secret data or a zero to indicate that the secret data was
unencrypted. The MD5 hash function was always used to convert the
passphrase to a key for the specified cipher algorithm.
For compatibility, when an S2K specifier is used, the special value
255 is stored in the position where the hash algorithm octet would
have been in the old data structure. This is then followed
immediately by a one-octet algorithm identifier, and then by the S2K
specifier as encoded above.
Therefore, preceding the secret data there will be one of these
possibilities:
0: secret data is unencrypted (no pass phrase)
255: followed by algorithm octet and S2K specifier
Cipher alg: use Simple S2K algorithm using MD5 hash
This last possibility, the cipher algorithm number with an implicit
use of MD5 and IDEA, is provided for backward compatibility; it MAY
be understood, but SHOULD NOT be generated, and is deprecated.
These are followed by an 8-octet Initial Vector for the decryption of
the secret values, if they are encrypted, and then the secret key
values themselves.
3.6.2.2. Symmetric-key message encryption
OpenPGP can create a Symmetric-key Encrypted Session Key (ESK) packet
at the front of a message. This is used to allow S2K specifiers to
be used for the passphrase conversion or to create messages with a
mix of symmetric-key ESKs and public-key ESKs. This allows a message
to be decrypted either with a passphrase or a public key.
PGP 2.X always used IDEA with Simple string-to-key conversion when
encrypting a message with a symmetric algorithm. This is deprecated,
but MAY be used for backward-compatibility.
Updated: 1999-05-23 wkoch