Hi all,

first of all, my apologies for the patch bomb following up in reply to this mail -- it's not meant to receive any serious review at all, but only to support the discussion I'm hoping to get going.

As some of you might already be aware, all new submissions for FIPS certification will be required to comply with NIST SP800-90B from Nov 7th on ([1], sec. 7.18 "Entropy Estimation and Compliance with SP 800-90B"). For reference: broadly speaking, NIST SP800-90B is about noise sources, SP800-90A about the DRBG algorithms stacked on top and SP800-90C about how everything is supposed to be glued together. The main requirements from SP800-90B are
- no correlations between different noise sources,
- to continuously run certain health tests on a noise source's output and
- to provide an interface enabling access to the raw noise samples for validation purposes.

To my knowledge, all SP800-90B compliant noise sources available on Linux today are either based on the Jitter RNG one way or another or on architectural RNGs like e.g. x86's RDSEED or arm64's RNDRRS. Currently, there's an in-kernel Jitter RNG implementation getting registered (cf. crypto/drbg.c, (*)) with the Crypto RNG API, which is also accessible from userspace via AF_ALG. The userspace haveged ([2]) or jitterentropy integrations ([3]) are worth mentioning in this context, too. So in summary, I think that for the in-kernel entropy consumers falling under the scope of FIPS, the only way to stay compliant at present would be to draw entropy from said Crypto API RNG. For userspace applications there's the additional option to invoke haveged and the like.

OTOH, CPU jitter based techniques are not uncontroversial ([4]). In any case, it would certainly be a good idea to mix (xor or whatever) any jitter output with entropy obtained from /dev/random (**).
If I'm not mistaken, the mentioned Crypto API RNG implementation (crypto/drbg.c) follows exactly this approach, but doesn't enforce it yet: there's no wait_for_random_bytes() and early DRBG invocations could in principle run on seeds dominated entirely by jitterentropy. However, this can probably get sorted quite easily and thus, one reasonable way towards maintaining FIPS resp. SP800-90 compliance would be to
- make crypto/drbg.c invoke wait_for_random_bytes(),
- make all relevant in-kernel consumers draw their random numbers from the Crypto RNG API, if not already the case and
- convert all relevant userspace to use a SP800-90B conforming Jitter RNG style noise source for compliance reasons, either by invoking the kernel's Crypto RNG API or by different means, and mix that with /dev/random.

Even though this would probably be feasible, I'm not sure that giving up on /dev/random being the primary, well established source of randomness in favor of each and every userspace crypto library rolling its own entropy collection scheme is necessarily the best solution (it might very well be though).

An obvious alternative would be to make /dev/random conform to SP800-90B. Stephan Müller posted his "LRNG" patchset ([5]), in which he proposed to introduce a second, independent implementation aiming at SP800-90[A-C] conformance. However, it's in the 35th iteration now and my impression is that there has hardly been any discussion around it for quite a while. I haven't followed the earlier development, but I can imagine several reasons for that:
- people are not really interested in FIPS or even question the whole concept in the first place (cf. Theodore Ts'o's remarks on this topic at [6]),
- potential reviewers simply got discouraged by the diffstat or
- people dislike the approach of having two competing implementations for what is basically the same functionality in the kernel.
In either case, I figured it might help further discussion to provide at least a rough idea of how badly the existing /dev/random implementation would get cluttered when worked towards SP800-90B compliance. So I implemented the required health tests for the interrupt noise source -- the resulting patches can be found in reply to this mail. I'd like to stress(!) that this should really only be considered a first step and that there would still be a long way towards a complete solution; known open items are listed below. Also, I'm fully aware that making those continuous health tests block the best effort primary_crng reseeds upon failure is a ridiculous thing to do -- that's again meant for demonstration purposes only, cf. the commit log from the next to last patch. Anyway, those of you who are interested in some more details beyond the mere diffstat can find them after the list of references below.

In summary, I can imagine three feasible ways towards SP800-90 compliance:
1.) Put the burden on consumers. For in-kernel users this would mean conversion to the Jitter backed Crypto RNG API, in case that hasn't happened yet. Userspace is free to use any approved Jitter based mechanism for compliance reasons, but is encouraged to mix that with /dev/random.
2.) Merge Stephan's LRNG. Users/distros would have to choose between the two competing implementations at kernel config time.
3.) Develop the existing /dev/random towards compliance, ideally w/o affecting !fips_enabled users too much. This would likely require some redundancies as well as some atrocities imposed by the specs.

I'm looking forward to hearing your opinions and suggestions! In case you happen to know of anybody who's not on CC but might potentially be interested in FIPS, I'd highly appreciate it if you could point him/her to this thread. The usual suspects are probably (enterprise?) distro folks, but there might be others I haven't thought of.

Many thanks for your time!
Nicolai

(*) That's an oversimplification for the sake of brevity: actually SP800-90A DRBGs stacked on top of the SP800-90B conforming jitterentropy source get registered with the Crypto API.
(**) "/dev/random" is used as a synonym for everything related to drivers/char/random.c throughout this mail.

[1] https://csrc.nist.gov/csrc/media/projects/cryptographic-module-validation-program/documents/fips140-2/fips1402ig.pdf
[2] http://www.issihosts.com/haveged/
[3] http://www.chronox.de/jent/doc/CPU-Jitter-NPTRNG.html
    cf. appendices C-E
[4] https://lwn.net/Articles/642166/
[5] https://lkml.kernel.org/r/5667034.lOV4Wx5bFT@xxxxxxxxxxxxxxxxxxx
[6] https://lkml.kernel.org/r/20170919133959.5fgtioyonlsdyjf5@xxxxxxxxx
    https://lkml.kernel.org/r/20170920011642.cczekznqebf2zq5u@xxxxxxxxx
[7] https://lkml.kernel.org/r/aef70b42-763f-0697-f12e-1b8b1be13b07@xxxxxxxxx

As promised above, some more details on the RFC series sent alongside follow. The primary goal was to implement the health test functionality required by SP800-90B for the existing drivers/char/random.c without affecting !fips_enabled users in any way. As outlined below, I failed quite miserably as far as performance is concerned, but that should be possible to rectify. Kernel version v5.9-rc4 was used as a basis.

The series can be logically subdivided into the following parts:
- [1-5]: Preparatory cleanup.
- [6-17]: Implement support for deferring entropy credit dispatch to the global balance to long after the corresponding pool mixing operation has taken place. Needed for "holding back" entropy until the health tests have finished on the latest pending batch of samples.
- [18-21]: Move arch_get_random_{seed_,}long() out of the interrupt path. Needed to adhere to how SP800-90C expects multiple noise sources to get combined, but is also worthwhile on its own from a performance POV.
- [22-23]: Don't award entropy to non-SP800-90B conforming architectural RNGs if fips_enabled is set.
- [24]: Move rand_initialize() to after time_init(). A "fix" for what is currently a non-issue, but it's a prerequisite for the subsequent patch.
- [25]: Detect cycle counter resolution, subsequently needed for making a per-IRQ entropy assessment.
- [26-28]: Follow Stephan's LRNG approach in how much entropy gets awarded to what: a lot more than before to add_interrupt_randomness(), none to add_{disk,input}_randomness() anymore.
- [29-33]: Introduce empty health test stubs and wire them up to add_interrupt_randomness().
- [34-36]: Implement the Adaptive Proportion Test (APT) as specified by SP800-90B and squeeze some more statistical power out of it.
- [37]: Implement SP800-90B's Repetition Count Test (RCT).
- [38-40]: Implement the startup tests, which are nothing but the continuous tests (APT + RCT) run on a specified amount of samples at boot time.
- [41]: Attempt to keep the system going in case the entropy estimate had been too optimistic and the health tests keep failing.

As the health tests are run from interrupt context on each sample, a performance measurement is due. To this end, I configured a Raspberry Pi 2B (ARMv7 Cortex A7) to disable all peripherals, gated a 19.2 MHz / 2048 ~= 9.4 kHz clock signal to some edge triggered GPIO and function_graph traced add_interrupt_randomness() for 10 min from a busybox initramfs. Unfortunately, the results were a bit disappointing: with fips_enabled unset there was an average runtime degradation of ~12.5% w/o SMP and ~5% w/ SMP, merely due to the application of the patches onto the v5.9-rc4 base. However, as the amount of work should not have changed much and given that struct fast_pool still fits into a single cacheline, I'm optimistic that this can get rectified by e.g. introducing a static_key for fips_enabled and perhaps shuffling branches a bit such that the !fips_enabled code becomes more linear.
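For readers who haven't looked at SP800-90B yet, the following gives a rough idea of how little work the simpler of the two continuous tests, the RCT (sec. 4.4.1), does per sample. It is a minimal user-space sketch, not code from the series: all names (struct rct_state, rct_cutoff(), rct_init(), rct_process()) are made up for illustration, and the entropy estimate is passed as a fraction h_num / h_den bits per sample merely to avoid floating point.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Sketch of the Repetition Count Test from SP800-90B, sec. 4.4.1.
 * All identifiers are hypothetical and do not match the patch series.
 */
struct rct_state {
	uint32_t last;       /* value of the current run (A in the spec) */
	unsigned int run;    /* length of the current run (B in the spec) */
	unsigned int cutoff; /* failure threshold (C in the spec) */
};

/*
 * C = 1 + ceil(-log2(alpha) / H), computed in integer arithmetic.
 * The false positive rate is alpha = 2^-log2_inv_alpha, the assessed
 * min-entropy per sample is H = h_num / h_den bits.
 */
static unsigned int rct_cutoff(unsigned int log2_inv_alpha,
			       unsigned int h_num, unsigned int h_den)
{
	unsigned int n = log2_inv_alpha * h_den;

	assert(h_num > 0 && h_den > 0);
	return 1 + (n + h_num - 1) / h_num;
}

static void rct_init(struct rct_state *s, unsigned int cutoff)
{
	s->last = 0;
	s->run = 0;	/* zero means "no sample seen yet" */
	s->cutoff = cutoff;
}

/* Feed one sample; returns false once a run has reached the cutoff. */
static bool rct_process(struct rct_state *s, uint32_t sample)
{
	if (s->run && sample == s->last) {
		s->run++;
	} else {
		s->last = sample;
		s->run = 1;
	}
	return s->run < s->cutoff;
}
```

With alpha = 2^-20 and an assessed entropy of 1 bit per sample, rct_cutoff(20, 1, 1) yields 21, i.e. the test trips on the 21st identical sample in a row. The APT is more involved, as its cutoff derives from a binomial critical value, and is not sketched here.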
OTOH, the impact of enabling the health tests by means of setting fips_enabled was not as dramatic: the observed increase in average add_interrupt_randomness() runtimes was 6% w/o SMP and 5% w/ SMP respectively.

Apart from those well controlled experiments on a RPi, I also did some lax benchmarking on my x86 desktop (which has some Intel i9, IIRC). More specifically, I simply didn't touch the system and ftraced add_interrupt_randomness() for 15 mins. The number of captured events was about 2000 in each configuration. Here the add_interrupt_randomness() performance improved greatly: from 4.3 us on average w/o the patches down to 2.0 us with the patches applied and fips_enabled set. However, I suppose this gain was due to the removal of RDSEED from add_interrupt_randomness(). Indeed, when inspecting the distribution of add_interrupt_randomness() runtimes on plain v5.9-rc4 more closely, it can be seen that there's a good portion of events (about 1/4th) where add_interrupt_randomness() took about 10us. So I think that this comparison isn't really a fair one...

To the best of my knowledge, these are the remaining open questions/items towards full SP800-90[A-C] compliance:
- There's no (debugfs?) interface for accessing raw samples for validation purposes yet. That would be doable though.
- try_to_generate_entropy() should probably get wired up to the health tests as well. More or less straightforward to implement, too.
- Diverting fast_pool contents into net_rand_state is not allowed (for a related discussion on this topic see [7]).
- I've been told that SP800-90A is not a hard requirement yet, but I suppose it will eventually become one. This would mean that the chacha20 RNG would have to get replaced by something approved for fips_enabled.
- The sequence of fast_pool -> input_pool -> extract_buf() operations is to be considered a "non-vetted conditioning component" in SP800-90B speak. It would follow that the output can't be estimated as having full entropy, but only 0.999 of its length at max. (cf. sec. 3.1.5.2). This could be resolved by running a SP800-90A derivation function at CRNG reseeding for fips_enabled. extract_buf(), which is already SHA1 based, could perhaps be transformed into one as well.
- The only mention of combining different noise sources I was able to find is in SP800-90C, sec. 5.3.4 ("Using Multiple Entropy Sources"): it clearly states that the outputs have to be combined by concatenation. add_hwgenerator_randomness() mixes into the same input_pool as add_interrupt_randomness() though and I would expect that this isn't allowed, independent of whether the noise source backing the former is SP800-90B compliant or not. IIUC, Stephan solved this for his LRNG by maintaining a separate pool for the hw generator.
- SP800-90A sets an upper bound on how many bits may be drawn from a DRBG/crng before a reseed *must* take place ("reseed_interval"). That shouldn't matter much in practice, at least not with CONFIG_NUMA: with reseed_interval == 2^32 bits, a single CRNG instance would be allowed to hand out only 500MB worth of randomness before reseeding, but a (single) numa crng chained to the primary_crng may produce as much as 8PB before the latter must eventually get reseeded from the input_pool. But AFAICT, a SP800-90A conforming implementation would still have to make provisions for a blocking extract_crng().
- It's entirely unclear to me whether support for "prediction resistance requests" is optional. It would be a pity if it weren't, because IIUC that would effectively imply a return to the former blocking_pool behaviour, which is obviously a no-no.
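The 500MB and 8PB figures from the reseed_interval item above can be reproduced with the small sketch below. It assumes that each reseed of the chained numa crng consumes 32 bytes (one 256 bit ChaCha20 key) of primary_crng output; that per-reseed cost is an assumption made here because it matches the 8PB figure, it is not stated anywhere above. The macro and function names are made up for illustration.

```c
#include <stdint.h>

/* From the text: reseed_interval == 2^32 bits per CRNG instance. */
#define RESEED_INTERVAL_BITS	(UINT64_C(1) << 32)

/*
 * Assumption (not from the text): one reseed of a chained crng draws
 * a 256 bit key, i.e. 32 bytes, from the primary_crng.
 */
#define CRNG_RESEED_BYTES	UINT64_C(32)

/* Bytes a single CRNG instance may hand out between two reseeds. */
static uint64_t crng_budget_bytes(void)
{
	return RESEED_INTERVAL_BITS / 8;	/* 2^29 bytes */
}

/*
 * Bytes a single numa crng chained to the primary_crng may produce
 * before the primary itself has exhausted its reseed_interval.
 */
static uint64_t chained_crng_budget_bytes(void)
{
	/* Number of numa crng reseeds the primary can serve: 2^24. */
	uint64_t reseeds = crng_budget_bytes() / CRNG_RESEED_BYTES;

	/* Each reseed buys another full budget: 2^24 * 2^29 = 2^53. */
	return reseeds * crng_budget_bytes();
}
```

That is 2^29 bytes = 512 MiB for a standalone instance and 2^53 bytes = 8 PiB for the chained case. If a reseed were accounted as a full 64 byte ChaCha20 block instead, the latter figure would halve.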
Nicolai Stange (41):
  random: remove dead code in credit_entropy_bits()
  random: remove dead code for nbits < 0 in credit_entropy_bits()
  random: prune dead assignment to entropy_bits in credit_entropy_bits()
  random: drop 'reserved' parameter from extract_entropy()
  random: don't reset entropy to zero on overflow
  random: factor the exponential approximation in credit_entropy_bits() out
  random: let pool_entropy_delta() take nbits in units of 2^-ENTROPY_SHIFT
  random: introduce __credit_entropy_bits_fast() for hot paths
  random: protect ->entropy_count with the pool spinlock
  random: implement support for delayed entropy dispatching
  random: convert add_timer_randomness() to queued_entropy API
  random: convert add_interrupt_randomness() to queued_entropy API
  random: convert try_to_generate_entropy() to queued_entropy API
  random: drop __credit_entropy_bits_fast()
  random: convert add_hwgenerator_randomness() to queued_entropy API
  random: convert random_ioctl() to queued_entropy API
  random: drop credit_entropy_bits() and credit_entropy_bits_safe()
  random: move arch_get_random_seed() calls in crng_reseed() into own loop
  random: reintroduce arch_has_random() + arch_has_random_seed()
  random: provide min_crng_reseed_pool_entropy()
  random: don't invoke arch_get_random_long() from add_interrupt_randomness()
  random: introduce arch_has_sp800_90b_random_seed()
  random: don't award entropy to non-SP800-90B arch RNGs in FIPS mode
  init: call time_init() before rand_initialize()
  random: probe cycle counter resolution at initialization
  random: implement support for evaluating larger fast_pool entropies
  random: increase per-IRQ event entropy estimate if in FIPS mode
  random: don't award entropy to disk + input events if in FIPS mode
  random: move definition of struct queued_entropy and related API upwards
  random: add a queued_entropy instance to struct fast_pool
  random: introduce struct health_test + health_test_reset() placeholders
  random: introduce health test stub and wire it up
  random: make health_test_process() maintain the get_cycles() delta
  random: implement the "Adaptive Proportion" NIST SP800-90B health test
  random: improve the APT's statistical power
  random: optimize the APT's presearch
  random: implement the "Repetition Count" NIST SP800-90B health test
  random: enable NIST SP800-90B startup tests
  random: make the startup tests include muliple APT invocations
  random: trigger startup health test on any failure of the health tests
  random: lower per-IRQ entropy estimate upon health test failure

 arch/arm64/include/asm/archrandom.h   |   33 +-
 arch/powerpc/include/asm/archrandom.h |   17 +-
 arch/s390/include/asm/archrandom.h    |   19 +-
 arch/x86/include/asm/archrandom.h     |   26 +-
 drivers/char/random.c                 | 1141 ++++++++++++++++++++++---
 include/linux/random.h                |   17 +
 init/main.c                           |    2 +-
 7 files changed, 1101 insertions(+), 154 deletions(-)

-- 
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg), GF: Felix Imendörffer

-- 
2.26.2