Herbert Xu <herbert@xxxxxxxxxxxxxxxxxxx> writes:

> On Sun, Oct 03, 2021 at 08:14:13PM +0200, Nicolai Stange wrote:
>> crypto_alg_mod_lookup() invokes the crypto_larval_lookup() helper
>> to run the actual search for matching crypto_alg implementation and
>> larval entries. The latter is currently considering only the individual
>> entries' relative ->cra_priority for determining which one out of multiple
>> matches to return. This means that it would potentially dismiss a matching
>> crypto_alg implementation in working state in favor of some pending
>> testing larval of higher ->cra_priority. Now, if the testmgr instance
>> invoked asynchronously on that testing larval came to the conclusion that
>> it should mark the tests as failed, any pending crypto_alg_mod_lookup()
>> waiting for it would be made to fail as well with -EAGAIN.
>>
>> In summary, crypto_alg_mod_lookup() can fail spuriously with -EAGAIN even
>> though an implementation in working state would have been available, namely
>> if the testmgr asynchronously marked another, competing implementation of
>> higher ->cra_priority as failed.
>>
>> This is normally not a problem at all with upstream, because the situation
>> where one algorithm passed its tests, but another competing one failed to
>> do so, would indicate a bug anyway.
>>
>> However, for downstream distributions seeking FIPS certification, simply
>> amending the list in crypto/testmgr.c with ->fips_allowed = 0 entries
>> matching on ->cra_driver_name would provide a convenient way of
>> selectively blacklisting implementations from drivers/crypto in fips
>> mode. Note that in this scenario failure of competing crypto_alg
>> implementations would become more common, in particular during device
>> enumeration. If the algorithm in question happened to be needed for e.g.
>> module signature verification, module loading could spuriously fail during
>> bootup, which is certainly not desired.
>>
>> For transparency: this has not actually been observed, I merely came to
>> the conclusion that it would be possible by reading the code.
>>
>> Make crypto_alg_lookup() run an additional search for non-larval matches
>> upfront in the common case that the request has been made for
>> CRYPTO_ALG_TESTED instances.
>>
>> Signed-off-by: Nicolai Stange <nstange@xxxxxxx>
>> ---
>>  crypto/api.c | 21 ++++++++++++++++++++-
>>  1 file changed, 20 insertions(+), 1 deletion(-)

Hi Herbert,

> It's not clear that this new behaviour is desirable. For example,
> when we construct certain complex algorithms, they may depend on a
> generic version of that same algorithm as a fallback. We do not
> want users to get the generic version while the better version is
> being tested.

Ah I see, you mean something like having 3+ providers of "algXY",
- algXY_driver0, ->cra_priority = 100
- algXY_driver1, ->cra_priority = 200
- algXY_driver2, ->cra_priority = 300
where the last one needs a different "algXY" as a fallback?

Hmm yes, I hadn't thought of this and my patch here would indeed break
it.

> Can you please explain what your failure scenario is and perhaps we
> can come up with another way of resolving your problem?

In order to keep a FIPS certification manageable in terms of scope,
we're looking for a way to disable everything under drivers/crypto iff
fips_enabled == 1. The most convenient way to achieve this downstream
would be to add dummy entries to testmgr.c like so:

  static int alg_test_nop(const struct alg_test_desc *desc,
                          const char *driver, u32 type, u32 mask)
  {
          /* Succeed in non-FIPS mode. */
          return 0;
  }

  static const struct alg_test_desc alg_test_descs[] = {
          ...,
          {
                  .alg = "sha256-padlock-nano",
                  .test = alg_test_nop,
                  .fips_allowed = 0,
          },
          ...
  };

The concern is about e.g. the following sequence of events during boot:
- "sha256-padlock-nano" gets registered, the test gets scheduled.
- An unrelated modprobe is getting invoked.
  The signature verification code, e.g. pkcs7_digest(), requests
  "sha256", finds the pending "sha256-padlock-nano" testing larval and
  puts itself in a wait for it.
- The scheduled "sha256-padlock-nano" test gets to run and, as per
  ->fips_allowed == 0, is forced to fail with -EINVAL.
- pkcs7_digest() wakes up and fails with -EAGAIN due to the "failed"
  testing larval.
- The unrelated modprobe fails even though sha256-generic would have
  been available all the time.

FWIW, I picked sha256-padlock-nano and modprobe at random for the sake
of providing an example here. I haven't checked in detail, but I guess
that e.g. the combination of dm-crypt + a number of different FIPS
approved algorithms could be similarly susceptible, too.

So to recap, my initial approach for working around the above was to
make crypto_alg_lookup() skip over testing larvals in an additional,
first search. As you said, this would break the "fallback" scenario
though.

As an alternative, how about not doing the additional search for
non-larvals upfront, but only as a last resort in case
crypto_larval_wait() returned failure instead?

But granted, the scenario above is not an issue at all for upstream with
a pristine testmgr.c and it would be quite understandable if you didn't
want to be bothered with any of this. I'm only bringing it up because
others might perhaps come across this as well...

Thanks!

Nicolai

--
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg), GF: Felix Imendörffer
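
To illustrate the alternative floated above (searching for already-tested
implementations only after the wait on a testing larval has failed), here
is a small userspace model in C. It is not kernel code and not a proposed
patch: struct toy_alg, enum toy_state, toy_search() and toy_lookup() are
hypothetical names made up for this sketch, and the waiting done by the
real crypto_larval_wait() is reduced to reading a precomputed test
outcome.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a registered algorithm or testing larval. */
enum toy_state {
	TOY_TESTED,		/* passed its self-tests, usable */
	TOY_LARVAL_PASS,	/* test pending, will eventually pass */
	TOY_LARVAL_FAIL,	/* test pending, will eventually fail */
};

struct toy_alg {
	const char *name;	/* e.g. "sha256" */
	const char *driver;	/* e.g. "sha256-generic" */
	int priority;		/* models ->cra_priority */
	enum toy_state state;
};

/* Highest-priority match by name; optionally skip pending larvals. */
static const struct toy_alg *toy_search(const struct toy_alg *algs, size_t n,
					const char *name, int tested_only)
{
	const struct toy_alg *best = NULL;
	size_t i;

	for (i = 0; i < n; i++) {
		if (strcmp(algs[i].name, name) != 0)
			continue;
		if (tested_only && algs[i].state != TOY_TESTED)
			continue;
		if (!best || algs[i].priority > best->priority)
			best = &algs[i];
	}
	return best;
}

/*
 * Models the lookup. Returns 0 and sets *out on success, -ENOENT if
 * nothing matches by name, and -EAGAIN if the chosen larval failed its
 * test and either no retry was requested or the retry found nothing.
 */
static int toy_lookup(const struct toy_alg *algs, size_t n, const char *name,
		      int retry_on_failure, const struct toy_alg **out)
{
	const struct toy_alg *alg;

	assert(algs && name && out);

	/* First pass: priority only, pending larvals included. */
	alg = toy_search(algs, n, name, 0);
	if (!alg)
		return -ENOENT;

	/* "Waiting" for a larval is modelled as reading its outcome. */
	if (alg->state == TOY_LARVAL_FAIL) {
		if (!retry_on_failure)
			return -EAGAIN;
		/* Last resort: consider only tested implementations. */
		alg = toy_search(algs, n, name, 1);
		if (!alg)
			return -EAGAIN;
	}

	*out = alg;
	return 0;
}
```

With a table holding a tested sha256-generic (priority 100) and a
sha256-padlock-nano larval (priority 300) whose test will fail,
toy_lookup() returns -EAGAIN when retry_on_failure is 0 and yields
sha256-generic when it is 1. A larval whose test will pass is still
returned as before, so in this model callers keep waiting for the better
implementation in the "fallback" case.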