> On Apr 19, 2023, at 7:32 PM, Dave Chinner <david@xxxxxxxxxxxxx> wrote: > > On Wed, Apr 19, 2023 at 01:51:12PM +0000, Chuck Lever III wrote: >> >> >>> On Apr 16, 2023, at 7:37 PM, Dave Chinner <david@xxxxxxxxxxxxx> wrote: >>> >>> On Sun, Apr 16, 2023 at 07:51:41AM -0400, Jeff Layton wrote: >>>> On Sun, 2023-04-16 at 08:21 +0900, Tetsuo Handa wrote: >>>>> On 2023/04/16 3:40, Jeff Layton wrote: >>>>>> On Sun, 2023-04-16 at 02:11 +0900, Tetsuo Handa wrote: >>>>>>> On 2023/04/16 1:13, Chuck Lever III wrote: >>>>>>>>> On Apr 15, 2023, at 7:07 AM, Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx> wrote: >>>>>>>>> >>>>>>>>> Since GFP_KERNEL is GFP_NOFS | __GFP_FS, usage like GFP_KERNEL | GFP_NOFS >>>>>>>>> does not make sense. Drop __GFP_FS flag in order to avoid deadlock. >>>>>>>> >>>>>>>> The server side threads run in process context. GFP_KERNEL >>>>>>>> is safe to use here -- as Jeff said, this code is not in >>>>>>>> the server's reclaim path. Plenty of other call sites in >>>>>>>> the NFS server code use GFP_KERNEL. >>>>>>> >>>>>>> GFP_KERNEL memory allocation calls filesystem's shrinker functions >>>>>>> because of __GFP_FS flag. My understanding is >>>>>>> >>>>>>> Whether this code is in memory reclaim path or not is irrelevant. >>>>>>> Whether memory reclaim path might hold lock or not is relevant. >>>>>>> >>>>>>> . Therefore, question is, does nfsd hold i_rwsem during memory reclaim path? >>>>>>> >>>>>> >>>>>> No. At the time of these allocations, the i_rwsem is not held. >>>>> >>>>> Excuse me? nfsd_getxattr()/nfsd_listxattr() _are_ holding i_rwsem >>>>> via inode_lock_shared(inode) before kvmalloc(GFP_KERNEL | GFP_NOFS) allocation. >>>>> That's why >>>>> >>>>> /* >>>>> * We're holding i_rwsem - use GFP_NOFS. >>>>> */ >>>>> >>>>> is explicitly there in nfsd_listxattr() side. >>> >>> You can do GFP_KERNEL allocations holding the i_rwsem just fine. >>> All that it requires is the caller holds a reference to the inode, >>> and at that point inode will should skip the given inode without >>> every locking it. >> >> This suggests that the fix is to replace "GFP_KERNEL | GFP_NOFS" >> with "GFP_KERNEL" /and/ ensure those paths are holding an >> appropriate inode reference. > > If the code that provided the inode to nfsd_listxattr() did not > already have an active inode reference in the first place then there > are much, much bigger UAF problems to worry about than simple > memory reclaim deadlocks. > > That said, nfsd_listxattr() does: > > dentry = fhp->fh_dentry; > inode = d_inode(dentry); > *lenp = 0; > > inode_lock_shared(inode); > > len = vfs_listxattr(dentry, NULL, 0); > > Given that a dentry pointing to an inode *must* hold an active > reference to that inode, I don't see how it is possible this code > path could be using an unreferenced inode. > > nfsd_getxattr() has a similar code fragment to obtain the inode as > well, so same goes for that... Dave, thanks for handling the due diligence! I was not 100% sure about code that handles xattrs rather than the primary byte stream of a file. Tetsuo, you can send a v2, or just let me know and I will make a patch to correct the GFP flags. -- Chuck Lever