On Mon, Oct 02, 2023 at 09:10:22AM -0700, Linus Torvalds wrote:
> On Sun, 1 Oct 2023 at 19:30, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
> >
> > That stuff can be accessed by ->d_hash()/->d_compare(); as it is, we have
> > a hard-to-hit UAF if rcu pathwalk manages to get into ->d_hash() on a filesystem
> > that is in process of getting shut down.
> >
> > Besides, having nls and upcase table cleanup moved from ->put_super() towards
> > the place where sbi is freed makes for simpler failure exits.
>
> I don't disagree with moving the freeing, but the RCU-delay makes me go "hmm".
>
> Is there some reason why we can't try to do this in generic code? The
> umount code already does RCU delays for other things, I get the
> feeling that we should have a RCU delay between "put_super" and
> "kill_sb".
>
> Could we move the ->kill_sb() call into destroy_super_work(), which is
> already RCU-delayed, for example?
>
> It feels wrong to have the filesystems have to deal with the vfs layer
> doing RCU-lookups.

For one thing, ->kill_sb() might do tons of IO.  And we really want to have
that done before umount(2) returns to userland, so that part can't be
offloaded via schedule_work()...