• Daniel Borkmann's avatar
    random32: upgrade taus88 generator to taus113 from errata paper · a98814ce
    Daniel Borkmann authored
    Since we use prandom*() functions quite often in networking code
    i.e. in UDP port selection, netfilter code, etc, upgrade the PRNG
    from Pierre L'Ecuyer's original paper "Maximally Equidistributed
    Combined Tausworthe Generators", Mathematics of Computation, 65,
    213 (1996), 203--213 to the version published in his errata paper [1].
    
    The Tausworthe generator is a maximally-equidistributed generator,
    that is fast and has good statistical properties [1].
    
    The version presented there upgrades the 3 state LFSR to a 4 state
    LFSR with increased periodicity from about 2^88 to 2^113. The
    algorithm is presented in [1] by the very same author who also
    designed the original algorithm in [2].
    
    Also, by increasing the state, we make it a bit harder for attackers
    to "guess" the PRNGs internal state. See also discussion in [3].
    
    Now, as we use this sort of weak initialization discussed in [3]
    only between core_initcall() until late_initcall() time [*] for
    prandom32*() users, namely in prandom_init(), it is less relevant
    from late_initcall() onwards as we overwrite seeds through
    prandom_reseed() anyways with a seed source of higher entropy, that
    is, get_random_bytes(). In other words, a exhaustive keysearch of
    96 bit would be needed. Now, with the help of this patch, this
    state-search increases further to 128 bit. Initialization needs
    to make sure that s1 > 1, s2 > 7, s3 > 15, s4 > 127.
    
    taus88 and taus113 algorithm is also part of GSL. I added a test
    case in the next patch to verify internal behaviour of this patch
    with GSL and ran tests with the dieharder 3.31.1 RNG test suite:
    
    $ dieharder -g 052 -a -m 10 -s 1 -S 4137730333 #taus88
    $ dieharder -g 054 -a -m 10 -s 1 -S 4137730333 #taus113
    
    With this seed configuration, in order to compare both, we get
    the following differences:
    
    algorithm                 taus88           taus113
    rands/second [**]         1.61e+08         1.37e+08
    sts_serial(4, 1st run)    WEAK             PASSED
    sts_serial(9, 2nd run)    WEAK             PASSED
    rgb_lagged_sum(31)        WEAK             PASSED
    
    We took out diehard_sums test as according to the authors it is
    considered broken and unusable [4]. Despite that and the slight
    decrease in performance (which is acceptable), taus113 here passes
    all 113 tests (only rgb_minimum_distance_5 in WEAK, the rest PASSED).
    In general, taus/taus113 is considered "very good" by the authors
    of dieharder [5].
    
    The papers [1][2] states a single warm-up step is sufficient by
    running quicktaus once on each state to ensure proper initialization
    of ~s_{0}:
    
    Our selection of (s) according to Table 1 of [1] row 1 holds the
    condition L - k <= r - s, that is,
    
      (32 32 32 32) - (31 29 28 25) <= (25 27 15 22) - (18 2 7 13)
    
    with r = k - q and q = (6 2 13 3) as also stated by the paper.
    So according to [2] we are safe with one round of quicktaus for
    initialization. However we decided to include the warm-up phase
    of the PRNG as done in GSL in every case as a safety net. We also
    use the warm up phase to make the output of the RNG easier to
    verify by the GSL output.
    
    In prandom_init(), we also mix random_get_entropy() into it, just
    like drivers/char/random.c does it, jiffies ^ random_get_entropy().
    random-get_entropy() is get_cycles(). xor is entropy preserving so
    it is fine if it is not implemented by some architectures.
    
    Note, this PRNG is *not* used for cryptography in the kernel, but
    rather as a fast PRNG for various randomizations i.e. in the
    networking code, or elsewhere for debugging purposes, for example.
    
    [*]: In order to generate some "sort of pseduo-randomness", since
    get_random_bytes() is not yet available for us, we use jiffies and
    initialize states s1 - s3 with a simple linear congruential generator
    (LCG), that is x <- x * 69069; and derive s2, s3, from the 32bit
    initialization from s1. So the above quote from [3] accounts only
    for the time from core to late initcall, not afterwards.
    [**] Single threaded run on MacBook Air w/ Intel Core i5-3317U
    
     [1] http://www.iro.umontreal.ca/~lecuyer/myftp/papers/tausme2.ps
     [2] http://www.iro.umontreal.ca/~lecuyer/myftp/papers/tausme.ps
     [3] http://thread.gmane.org/gmane.comp.encryption.general/12103/
     [4] http://code.google.com/p/dieharder/source/browse/trunk/libdieharder/diehard_sums.c?spec=svn490&r=490#20
     [5] http://www.phy.duke.edu/~rgb/General/dieharder.php
    
    Joint work with Hannes Frederic Sowa.
    
    Cc: Florian Weimer <fweimer@redhat.com>
    Cc: Theodore Ts'o <tytso@mit.edu>
    Signed-off-by: default avatarDaniel Borkmann <dborkman@redhat.com>
    Signed-off-by: default avatarHannes Frederic Sowa <hannes@stressinduktion.org>
    Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
    a98814ce
random.h 1.99 KB