On Wed, Aug 07, 2024 at 05:57:07AM +0200, Mateusz Guzik wrote: [there'll be a separate reply with what I hope might be a usable approach] > Yes, this is my understanding of the code and part of my compliant. :) > > Things just work(tm) as is with NULLified pointers, but this is error-prone. And carrying the arseloads of information (which ones do and which do not need to be dropped) is *less* error-prone? Are you serious? > As a hypothetical suppose there is code executing some time after > vfs_open which looks at nd->path.dentry and by finding the pointer is > NULL it concludes the lookup did not work out. > > If such code exists *and* the pointer is poisoned in the above sense > (notably merely branching on it with kasan already traps), then the > consumer will be caught immediately during coverage testing by > syzkaller. You are much too optimistic about the quality of test coverage in this particular area. > If such code exists but the pointer is only nullified, one is only > going to find out the hard way when some functionality weirdly breaks. To do _useful_ asserts, one needs invariants to check. And "we got to this check after having passed through that assignment at some earlier point" is not it. That's why I'm asking questions about the state. The thing is, suppose I (or you, or somebody else) is trying to modify the whole thing. There's a magical mystery assert in the way; what should be done with it? Move it/split it/remove it/do something random and hope syzkaller won't catch anything? If I can reason about the predicate being checked, I can at least start figuring out what should be done. If not, it's bloody guaranteed to rot. This particular area (pathwalk machinery) has a nasty history of growing complexity once in a while, with following cleanups and massage to get it back into more or less tolerable shape. And refactoring that had been _painful_ - I'd done more than a few there. As far as I can tell, at the moment this flag (and yes, I've seen its removal in the next version) is "we'd called vfs_open_consume() at some point, then found ourselves still in RCU mode or we'd called vfs_open_consume() more than once". This is *NOT* a property of state; it's a property of execution history. The first part is checked in the wrong place - one of the invariants (trivially verified by code examination) is that LOOKUP_RCU is never regained after it had been dropped. The only place where it can be set is path_init() and calling _that_ between path_init() and terminate_walk() would be a) a hard and very visible bug b) would've wiped your flag anyway. So that part of the check is basically "we are not calling vfs_open_consume() under rcu_read_lock()". Which is definitely a desirable property, since ->open() can block. So can mnt_want_write() several lines prior. Invariant here is "the places where we set FMODE_OPENED or FMODE_CREATED may not have LOOKUP_RCU". Having if (!(file->f_mode & (FMODE_OPENED | FMODE_CREATED))) { error = complete_walk(nd); if (error) return error; } in the beginning of do_open() guarantees that for vfs_open() call there. All other places where that can happen are in lookup_open() or called from it (via ->atomic_open() to finish_open()). And *that* definitely should not be done in RCU mode, due to if (open_flag & O_CREAT) inode_lock(dir->d_inode); else inode_lock_shared(dir->d_inode); dentry = lookup_open(nd, file, op, got_write); in the sole caller of that thing. Again, can't grab a blocking lock under rcu_read_lock(). Which is why we have this if (WARN_ON_ONCE(nd->flags & LOOKUP_RCU)) return ERR_PTR(-ECHILD); } else { /* create side of things */ if (nd->flags & LOOKUP_RCU) { if (!try_to_unlazy(nd)) return ERR_PTR(-ECHILD); } slightly prior to that call. WARN_ON_ONCE is basically "lookup_fast() has returned NULL and stayed in RCU mode", which should never happen. try_to_unlazy() is straight "either switch to non-RCU mode or return an error" - that's what this function is for. No WARN_ON after that - it would only obfuscate things. *IF* you want to add debugging checks for that kind of stuff, just call that assert_nonrcu(nd), make it check and whine and feel free to slap them in reasonable amount of places (anything that makes a reader go "for fuck sake, hadn't we (a) done that on the entry to this function and (b) done IO since then, anyway?" is obviously not reasonable, etc. - no more than common sense limitations). Another common sense thing: extra asserts won't confuse syzkaller, but they very much can confuse a human reader. And any rewrites are done by humans... As for the double call of vfs_open_consume()... You do realize that the damn thing wouldn't have reached that check if it would ever have cause to be triggered, right? Seeing that we call static inline struct mnt_idmap *mnt_idmap(const struct vfsmount *mnt) { /* Pairs with smp_store_release() in do_idmap_mount(). */ return smp_load_acquire(&mnt->mnt_idmap); } near the beginning of do_open(), ~20 lines before the place where you added that check... I'm not sure it makes sense to defend against a weird loop appearing out of nowhere near the top of call chain, but if you want to do that, this is not the right place for that.