This document describes in detail the latest deterministic random number generator (RNG) algorithm used in our CryptoSys range of products since 2007: CryptoSys API and the CryptoSys PKI Toolkit.
The RNG has been implemented to conform to NIST Special Publication 800-90† Recommendation for Random Number Generation Using Deterministic Random Bit Generators [SP80090], first published June 2006, revised March 2007. Entropy is accumulated in "Fortuna" pools as described in Ferguson and Schneier, Practical Cryptography, chapter 10, "Generating Randomness" [FERG03].
† Note 2013-09-21: Our implementation does not use the Dual EC_DRBG component of NIST 800-90 that has been revealed to contain an NSA backdoor.
This supersedes the old RNG implementation used prior to 2007 which complied with ANSI X9.31 Appendix A.2.4 [AX931] (which replaced X9.17 Appendix C). We assert that the NIST method as implemented here is at least as secure as the ANSI X9.31-compliant method under any circumstances.
This document describes the implementation for the Windows® operating system. The Linux implementation is similar but does not use thread-safe techniques and the personalisation string uses fewer parameters.
This information is published for peer review and comment. Please send us a message.
We use the term RNG in this document to mean a cryptographically-secure PRNG, deterministic RNG or deterministic RBG (DRBG). All these terms mean the same thing for our purposes.
RNG ≡ RBG ≡ DRBG ≡ PRNG
In particular, the terms random number generator (RNG) and random bit generator (RBG) are interchangeable. The output from a RNG or RBG is a sequence of zero and one bits. In our case, the output is always in 8-bit blocks (bytes, octets).
Our objective for our RNG is to produce, on request, a sequence of the required number of random bits. The RNG should be in compliance with FIPS 140-2 and NIST SP800-90, and any issues must be documented. The RNG must be compatible with a general-purpose cryptographic library which must be usable on any 32-bit variant of the Windows® operating system in a thread-safe manner. It must not interfere with the operation of the library unless it fatally fails.
We use two basic references for the background theory: NIST Special Publication 800-90 Recommendation for Random Number Generation Using Deterministic Random Bit Generators [SP80090] and Ferguson and Schneier, Practical Cryptography, chapter 10, "Generating Randomness" [FERG03]. Ferguson and Schneier is much easier to read and we have drawn on several sections of their book here (because they can write about the subject much better than we can).
In broad terms, there are three levels of PRNG.
A statistically-random PRNG is not necessarily cryptographically-secure. The concept of security strength is an attempt to quantify just how cryptographically secure it is.
PRNGs work by keeping an internal state. Typically this is a seed and a key, which are kept secret. When a consumer requests random data, a cryptographic algorithm operates on the seed and the key to produce pseudo-random output. The internal state is then updated so that the next request does not produce the same data. Ferguson and Schneier [FERG03] describe a simple generator using AES-256 and a 128-bit counter. NIST SP800-90 [SP80090] specifies a whole smorgasbord of generators using message digest hashes, HMACS, block ciphers and even elliptic curves. The idea is that designers can use whichever cryptographic function is already available to them.
Some typical pseudo-code for a PRNG generator might be:
INPUT: (Key, Seed) OUTPUT: random_data, (Key', Seed') random_data = F(Key, Seed) Key' = F(Key, Seed+1) Seed' = F(Key', Seed) return random_data
where F is a cryptographic function. All the generators are essentially some variant of this.
NIST SP800-90 formalises the resistance to attacks with the concepts of Backtracking Resistance and Prediction Resistance. (In the following, remember that PRNG, RBG and DRBG all mean the same thing.)
Backtracking resistance is provided by ensuring that the DRBG generator algorithm is a one-way function. This is easy: all the DRBG mechanisms in NIST SP800-90 provide backtracking resistance.
Prediction resistance depends on the Reseed process; that is, the ability to effectively reseed after a compromise but before the next request. This is more difficult.
Source: [FERG03]
- The attacker attempts to reconstruct the internal state from the output. This is a classic cryptographic attack, and rather easy to counter using cryptographic techniques.
- The attacker is at some point able to acquire the internal state. If no further entropy is added, then the attacker can follow all the outputs and all the updates of the internal state. This means that if the the PRNG is ever attacked successfully, then it can never recover to a secure state.
To solve: mix in entropy from truly-random events into the internal state.
- If the entropy added is only in small amounts - as it most likely will be - the attacker makes frequent requests for random data from the PRNG. As long as the total amount of entropy added between two such requests is limited to say, 30 bits, then the attacker can simply try all possibilities for the random inputs and recover the new state after the mixing.
The best defence against this particular attack is to pool the incoming events that contain entropy. You collect entropy until you have enough to mix into the internal state without the attacker being able to guess the pooled data. How much is enough? You want to have 128 bits of entropy. But here is the real problem: making any kind of estimate of the amount of entropy is extremely difficult, if not impossible. It depends heavily on how much the attacker knows or can know, but that information is not available to the developers during the design phase.
Source: [FERG03]
To reseed the generator, we need to pool events in a pool large enough that the attacker can no longer enumerate the possible values for the events in the pool. Fortuna solves the problem of how many events to collect in a pool before using it to reseed the generator.
There are 32 pools: P_{0},P_{1},...,P_{31}. Each pool conceptually contains a string of bytes of unbounded length but in practice contains the partly-computed hash of the string as it is assembled in the pool.
Each source distributes its random events over the pools in a cyclical fashion. This ensures that the entropy from each source is distributed more or less evenly over the pools. Each random event is appended to the string in the pool in question.
We reseed the generator every time pool P_{0} is long enough. Reseeds are numbered 1,2,3,.... Depending on the reseed number r, one or more pools are included in the reseed. Pool P_{i} is included if 2^{i} is a divisor of r. Thus P_{0} is used in every reseed, P_{1} every other reseed, P_{2} every fourth reseed, etc. After a pool is used in a reseed, it is reset to the empty string.
Irrespective of how many fake random events the attacker generates, or how many of the events he knows, as long as there is at least one source of random events he can't predict, there will always be a pool that collects enough entropy to defeat him.
To prevent the attacker injecting so many events that even if pool P_{31} does not contain enough randomness between reseeds to recover from a compromise, we limit the speed of the reseeds. A reseed will only be performed if the previous reseed was more than 100 milliseconds ago, so it will take more than 13 years before P_{32} would have been used, had it existed.
Source: [FERG03]
The measure of randomness is called entropy. Entropy measures how uncertain you are about the value. You can think of entropy as the average number of bits you would need to specify the value if you could use an ideal compression algorithm. The more you know about a value, the smaller its entropy is.
Mathematically, the definition of entropy, H(X), for a variable X is
where P[X=x] is the probability that the variable X takes on the value x.
For example, if you have a value consisting of a sequence of 16 bytes that are completely random; that is, each of the sequence of 128 bits (16 bytes x 8 bits/byte = 128 bits) has been chosen in an unbiased manner, then the 16-byte value has 128 bits of entropy. You would say that the security strength of the value is 128 bits, or the amount of work required to break the security is 2^{128} operations. This is computationally infeasable. (The hard part is to select the bytes in an unbiased manner.)
But if you know that each byte has been chosen from the set of, say, the two values {0x00, 0xFF} (so the sequence is made up of bytes that are either 0x00 or 0xFF in some random order), then you only have 16 bits of entropy. Each byte in the sequence has entropy of only 1 bit, so the sequence has 16 bits. Thus the work required to break the security has been reduced to 2^{16} operations: a mere 65,000 guesses.
At the far extreme, if an attacker knows exactly what these 16 bytes are, then you have zero bits of entropy.
So the amount of entropy can be anything between zero and the actual size of the value in bits depending on how much the attacker knows. It's relative to an observer and his knowledge prior to an observation.
From FIPS PUB 140-2 Annex C Approved Random Number Generators for FIPS PUB 140-2 [FIPS140XC], January 24, 2007:
ANNEX C: APPROVED RANDOM NUMBER GENERATORS
Annex C provides a list of the FIPS Approved random number generators applicable to FIPS PUB 140-2
(snip)...
6. National Institute of Standards and Technology, Recommendation for Random Number Generation Using Deterministic Random Bit Generators, Special Publication 800-90, June 2006.
From FIPS PUB 140-2 Section 4.9.2 Self-Tests - Conditional Tests [FIPS140]:
Continuous random number generator test.
If a cryptographic module employs Approved or non-Approved RNGs in an Approved mode of operation, the module shall perform the following continuous random number generator test on each RNG that tests for failure to a constant value.
1. If each call to a RNG produces blocks of n bits (where n > 15), the first n-bit block generated after power-up, initialization, or reset shall not be used, but shall be saved for comparison with the next n-bit block to be generated. Each subsequent generation of an n-bit block shall be compared with the previously generated block. The test shall fail if any two compared n-bit blocks are equal.
Our PRNG functions use the HMAC_DRBG
mechanism
as specified in Section 10.1.2 of NIST SP800-90 with SHA-1 as the underlying hash function.
Highest supported security strength = 128 Output block length (outlen) = 160 bits Maximum number of bits per request = 2^19 bits (i.e. 65536 bytes) Reseed interval = 2^30 (< 2^48 max for HMAC_DRBG) or as dictated by the Fortuna algorithm Maximum length of personalization string = 160 bits Maximum length of entropy input = 5120 bits (32 pools x 160 bits) Maximum additional input length = 2^34 bits (2^31 bytes)
Each thread has its own Generator in Thread Local Storage. Each process has one Accumulator accessed by all Generators and protected by a Critical Section when accessed. The accumulator has 32 "Fortuna" accumulation pools with the minimum pool size before a reseed set to 32 bytes.
The "personalization string" used on instantiation in each thread is derived by hashing the current time, process ID, thread ID, user name and computer name, and so is almost certain to be different each time. The instantiation nonce is a 32-bit value derived from the current time which is incremented on the instantiation of any new Generator in a different thread.
The random events polled by the accumulator include the system time, the clock count, the memory status, the position and class name of each window, the free disk space, and other system parameters. Every event is time-stamped to the accuracy of the system clock, which means that, in the worst case, we have a cumulative hash of the times of every event polled since power up.
In the absolute worst case, if no seed file is used and an attacker can call the
GenerateRandomData
function before any entropy has been generated by the system,
the output is effectively a "strong" hash of the current time and
personalization string with good backtracking resistance.
This is at least equivalent to an X9.31-compliant generator.
We do not make available to the consumer either a reseed_required_flag
or a prediction_resistance_request
.
This is by design to prevent a clash with the Fortuna accumulation system.
Reseeds are carried out when either:
The Fortuna algorithm will reseed every time pool P_{0} is long enough (in our case, when its length is 32 bytes or more) AND the time-since-last-reseed is greater than 100 milliseconds. This is not directly under the consumer's control, although he can force it eventually by repeatedly calling the library functions known to trigger entropy accumulation.
The NIST DRBG mechanism reseeds on either (a) first use; or (b) at the end of the seed life. The seed life of the DRBG mechanism is deliberately set high to reduce the risk of an attacker forcing a reseed by repeatedly requesting random data.
If the DRBG mechanism requires a reseed, then it requests entropy from the Fortuna pools, which is provided unconditionally.
A random number generator Health Check is carried out on power up and every time a new RNG generator is instantiated in a new thread. It can also be carried out on demand. The health check performs self-tests to obtain assurance that the DRBG continues to operate as designed and implemented according to section 11.3 of [SP80090] including Known Answer Testing, Testing the Instantiate Function, Testing the Generate Function, Testing the Reseed Function and Testing the Uninstantiate Function.
We use a 64-bit value for continuous checks as required in Section 4.9.2 of FIPS 140-2. Every time the consumer requests a set of random data, we generate an extra 64-bit value which is not output but is stored for comparison purposes. If this matches the first 64 bits of the next about-to-be-output data, then we throw a catastrophic error. If the length of the requested random data is less than 64 bits, then we pad the about-to-be-output data to 64 bits with zeroes before comparison. On start-up, we generate a 64-bit block that is not used for output but is saved for comparison with the next request.
In pseudo-code:
INPUT: nbits, previous_block OUTPUT: random_data, previous_block' if (previous_block != Null) { random_data = GenerateRandomData(nbits) if (nbits < 64) tocompare = random_data || 0^(64 - nbits) else tocompare = LeftMostBits(random_data, 64) if (tocompare == previous_block) return "catastrophic error" } previous_block' = GenerateRandomData(64) return random_data
A strict reading of FIPS 140-2 would seem to require a check of every successive 64-bit block generated
by the DRBG_Generate
function where the requested number of bits is greater than 64.
We assert that this pointless unless the HMAC-SHA-1 function is corrupted.
Furthermore, and far more serious, storing every generated block to compare with the next would expose a huge
security hole.
We chose the HMAC_DRBG
mechanism with SHA-1 because:
HMAC_DRBG
is preferred to Hash_DRBG
because it is more secure.
There are known problems with
Hash_DRBG
when the DRBG is instantiated with insufficient entropy for the requested
security strength.
But HMAC outputs look random even given knowledge and control over the inputs when an attacker doesn't know the key used.
Ref: [SP80090] p119-120. (In other words, you won't be able to tell if we get it wrong.)
We can see at least the following potential problems. We'd be happy to discuss them if you have some constructive comments.
RNG_Generate
but the system decides to call that function in another thread. In that case, the
generator in the new thread will be initialized again without the extra entropy from the seed file. It will only have the
basic nonce and personalisation string plus a small amount of entropy from the pools that may have been collected
in the interim. How can you tell this has happened? You can't.
Update: The algorithm in all products was updated in 2010
to mix seed and user-generated entropy from all threads into the pools.
Actually, we don't do any of this. We just use the rand()
function. Here is the source code.
We know nobody ever reads this far :-)
#include <stdlib.h> #include <time.h> void rng_bytes(unsigned char *buf, int nbytes) { static unsigned int seed; int i; if (seed == 0) { seed = (unsigned)time(NULL) ^ (unsigned)clock(); srand(seed); } for (i = 0; i < nbytes; i++) buf[i] = rand() & 0xFF; }
For more information or to comment on this page, please send us a message.
This page last updated 4 June 2014