On Fri, May 29 2020, J. Bruce Fields wrote:

> On Fri, May 29, 2020 at 10:53:15AM +1000, NeilBrown wrote:
>> I've received a report of a 5.3 kernel crashing in
>> nfs4_show_superblock().
>> I was part way through preparing a patch when I concluded that
>> the problem wasn't as straight forward as I thought.
>>
>> In the crash, the 'struct file *' passed to nfs4_show_superblock()
>> was NULL.
>> This file was acquired from find_any_file(), and every other caller
>> of find_any_file() checks that the returned value is not NULL (though
>> one BUGs if it is NULL - another WARNs).
>> But nfs4_show_open() and nfs4_show_lock() don't.
>> Maybe they should. I didn't double check, but I suspect they don't
>> hold enough locks to ensure that the files don't get removed.
>
> I think the only lock held is cl_lock, acquired in states_start.
>
> We're starting here with an nfs4_stid that was found in the cl_stateids
> idr.
>
> A struct nfs4_stid is freed by nfs4_put_stid(), which removes it from
> that idr under cl_lock before freeing the nfs4_stid and anything it
> points to.
>
> I think that was the theory....
>
> One possible problem is downgrades, like nfs4_stateid_downgrade.
>
> I'll keep mulling it over, thanks.

I had another look at the code, and maybe move_to_close_lru() is the
problem. It can remove the files and clear sc_file without taking
cl_lock. So some protection is needed against that.

I think that only applies to nfs4_show_open() - not show_lock etc.
But I wonder if it might be best to include some extra protection for
each different case, just in case some future code change allows
sc_file to become NULL before the state is detached.

I'd feel more comfortable about nfs4_show_superblock() if it ignored
nf_inode and just used nf_file - if it isn't NULL. It looks like it
can never be set from non-NULL to NULL.

Thanks,
NeilBrown
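
To illustrate the kind of extra protection I mean, here is a rough userspace sketch, not a patch. Everything in it is a hypothetical stand-in: struct file_model, struct nfs4_file_model, find_any_file_model() and show_open_model() only mimic the shape of the nfsd code and are not the kernel's definitions. The one point it makes is that the show routine checks the lookup result for NULL and skips the entry instead of dereferencing it.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct file_model {
	const char *sb_name;	/* stands in for the superblock info */
};

struct nfs4_file_model {
	struct file_model *fds[3];	/* one slot per access mode */
};

/*
 * Model of a find_any_file()-style lookup: return any file that is
 * still attached, or NULL if all of them have already been released.
 */
static struct file_model *find_any_file_model(struct nfs4_file_model *f)
{
	size_t i;

	if (!f)
		return NULL;
	for (i = 0; i < 3; i++)
		if (f->fds[i])
			return f->fds[i];
	return NULL;
}

/*
 * Model of a show routine.  The NULL check is the point of the sketch:
 * if the files were released concurrently, skip the entry rather than
 * crash.  Returns the number of characters formatted, or 0 if skipped.
 */
static int show_open_model(char *buf, size_t len, struct nfs4_file_model *f)
{
	struct file_model *file = find_any_file_model(f);

	if (!file)
		return 0;	/* nothing left to show for this state */
	return snprintf(buf, len, "superblock: %s", file->sb_name);
}
```

Calling show_open_model() on an entry whose slots are all NULL returns 0 and leaves the buffer untouched, where the unchecked version would dereference a NULL pointer.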