Luis Henriques <lhenriques@xxxxxxx> writes:

> Eric Biggers <ebiggers@xxxxxxxxxx> writes:
>
>> On Tue, Nov 28, 2023 at 02:16:44PM +0000, Luis Henriques wrote:
>>> >
>>> > Yeah, looking closer it makes sense.  Sorry for the noise.  I'm
>>> > currently investigating a test failure (which I can't reproduce
>>> > locally) where 'orig_key' unexpectedly is set to '1' and makes the
>>> > test fail because it was supposed to be '0'.  That's when this
>>> > caught my attention.  Anyway, I'll go look somewhere else.
>>>
>>> OK, I'm not 100% sure yet, but I have an idea about what's going on
>>> with this test failure.
>>>
>>> I _think_ that even after the following is done in the test:
>>>
>>>   _user_do_rm_enckey $SCRATCH_MNT $keyid
>>>   _scratch_cycle_mount
>>>
>>> the key garbage collector may not have finished running.  And then,
>>> when we read '/proc/key-users', we can race against key_user_put(),
>>> which needs key_user_lock, which is also grabbed while the proc file
>>> seq_operations are run.
>>>
>>> Eric, does this make any sense?  There is a loop in the test to wait
>>> for invalidated keys, but I believe it's not relevant anymore since
>>> commit d7e7b9af104c ("fscrypt: stop using keyrings subsystem for
>>> fscrypt_master_key").  But I might be misunderstanding the code.
>>
>> Thanks for looking into this!  I had noticed this test is still flaky
>> on arm64 but haven't had a chance to look into it.  Yes, it's probably
>> related to the key garbage collector again.  The test needs to wait
>> for the fscrypt "user" keys (key_type_fscrypt_user in the kernel) to
>> be released from the quota.  I think that loop in the test does not
>> have the intended effect because it waits for "invalidated" keys, but
>> the fscrypt "user" keys (which are charged to the quota) are never
>> invalidated; they're just released normally.  There used to be another
>> key (in the "keyrings" subsystem sense of the word "key") associated
>> with each fscrypt key, and that key was indeed invalidated, but that's
>> no longer the case.
>
> Awesome, thanks for confirming this.  That loop probably made sense
> back when keys were invalidated -- that behaviour was changed by the
> commit I mentioned, I believe.  Anyway, it's probably better to keep
> the loop around for testing old kernels, as it doesn't really hurt.
>
>> Maybe there's a better way to wait for the key garbage collector to
>> finish.
>>
>> Or the kernel could be changed to make releasing the key quota be
>> synchronous.  Unfortunately the keyrings subsystem doesn't seem to
>> work that way, and fscrypt is tying into the key quota from the
>> keyrings subsystem, so it is subject to its limitations.  But maybe
>> there's a way to do it.
>
> Hmm... yeah, that requires a closer look at that subsystem to see if
> something can be done.  I'll try to find something there that would
> make that test more reliable.

OK, I've finally spent some time looking at this, and I see two options
to fix it:

- Option 1: Add some logic to the test to try to close this race.  The
  loop that adds keys and expects the last one to fail could be changed
  to update the expected key quota whenever the number of keys reported
  in '/proc/key-users' isn't the expected one after adding a new key.
  This is hacky and still racy, as the GC could still run in between.  I
  have an attempt at implementing this which _seems_ to work, but I
  haven't spent much time testing it, because... it's ugly and option 2
  looks better.  The rough idea is sketched below.
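  Just to illustrate the idea, here's a sketch (not my actual diff; the
  'qnkeys_for_uid' helper is made up, and it assumes the 4th field of
  each '/proc/key-users' line is "qnkeys/maxkeys", i.e. the number of
  keys currently charged against that uid's quota):

    # Return how many keys are currently charged to the given uid's quota.
    qnkeys_for_uid() {
        awk -v uid="$1" '$1 == uid":" { split($4, q, "/"); print q[1] }' \
            /proc/key-users
    }

    # Before expecting the next key add to fail with EDQUOT, re-sample the
    # charged key count; if the GC hasn't released the removed fscrypt
    # "user" keys yet, bump the expected quota headroom.  Still racy: the
    # GC may run right after this sample is taken.
    charged=$(qnkeys_for_uid "$(id -u)")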
- Option 2: Modify the '/proc/key-users' ->start() handler so that the
  first thing it does is call flush_work() to force the GC to run.  This
  doesn't completely close the race window, but I think it makes things
  a bit more reliable for userspace scripts/applications relying on that
  data.  Also, this flushing should probably be done for '/proc/keys' as
  well, but the patch I've been hammering on does it for
  '/proc/key-users' only.  So far it seems to do the job, and I can't
  reproduce the test failure with it applied.  Obviously, forcing the GC
  to run will have a performance impact, but I'm not sure how bad that
  would be, nor how fast userspace expects reads of these procfs files
  to be.

Anyway, just for reference, I'm attaching below the patch I've been
testing.  Let me know what you think about it.  I'm still running more
tests, but if the patch looks good to you I can send out a proper RFC.

Cheers,
--
Luís

diff --git a/security/keys/gc.c b/security/keys/gc.c
index 3c90807476eb..57b5a54490a0 100644
--- a/security/keys/gc.c
+++ b/security/keys/gc.c
@@ -44,6 +44,12 @@ struct key_type key_type_dead = {
 	.name = ".dead",
 };
 
+void key_flush_gc(void)
+{
+	kenter("");
+	flush_work(&key_gc_work);
+}
+
 /*
  * Schedule a garbage collection run.
  * - time precision isn't particularly important
diff --git a/security/keys/internal.h b/security/keys/internal.h
index 471cf36dedc0..fee1d0025d96 100644
--- a/security/keys/internal.h
+++ b/security/keys/internal.h
@@ -170,6 +170,7 @@ extern void keyring_restriction_gc(struct key *keyring,
 extern void key_schedule_gc(time64_t gc_at);
 extern void key_schedule_gc_links(void);
 extern void key_gc_keytype(struct key_type *ktype);
+extern void key_flush_gc(void);
 
 extern int key_task_permission(const key_ref_t key_ref,
 			       const struct cred *cred,
diff --git a/security/keys/proc.c b/security/keys/proc.c
index d0cde6685627..2837e00a240a 100644
--- a/security/keys/proc.c
+++ b/security/keys/proc.c
@@ -277,6 +277,7 @@ static void *proc_key_users_start(struct seq_file *p, loff_t *_pos)
 	struct rb_node *_p;
 	loff_t pos = *_pos;
 
+	key_flush_gc();
 	spin_lock(&key_user_lock);
 
 	_p = key_user_first(seq_user_ns(p), &key_user_tree);