> On Apr 16, 2023, at 7:51 AM, Jeff Layton <jlayton@xxxxxxxxxx> wrote: > > On Sun, 2023-04-16 at 08:21 +0900, Tetsuo Handa wrote: >> On 2023/04/16 3:40, Jeff Layton wrote: >>> On Sun, 2023-04-16 at 02:11 +0900, Tetsuo Handa wrote: >>>> On 2023/04/16 1:13, Chuck Lever III wrote: >>>>>> On Apr 15, 2023, at 7:07 AM, Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx> wrote: >>>>>> >>>>>> Since GFP_KERNEL is GFP_NOFS | __GFP_FS, usage like GFP_KERNEL | GFP_NOFS >>>>>> does not make sense. Drop __GFP_FS flag in order to avoid deadlock. >>>>> >>>>> The server side threads run in process context. GFP_KERNEL >>>>> is safe to use here -- as Jeff said, this code is not in >>>>> the server's reclaim path. Plenty of other call sites in >>>>> the NFS server code use GFP_KERNEL. >>>> >>>> GFP_KERNEL memory allocation calls filesystem's shrinker functions >>>> because of __GFP_FS flag. My understanding is >>>> >>>> Whether this code is in memory reclaim path or not is irrelevant. >>>> Whether memory reclaim path might hold lock or not is relevant. >>>> >>>> . Therefore, question is, does nfsd hold i_rwsem during memory reclaim path? >>>> >>> >>> No. At the time of these allocations, the i_rwsem is not held. >> >> Excuse me? nfsd_getxattr()/nfsd_listxattr() _are_ holding i_rwsem >> via inode_lock_shared(inode) before kvmalloc(GFP_KERNEL | GFP_NOFS) allocation. >> That's why >> >> /* >> * We're holding i_rwsem - use GFP_NOFS. >> */ >> >> is explicitly there in nfsd_listxattr() side. >> >> If memory reclaim path (directly or indirectly via locking dependency) involves >> inode_lock_shared(inode)/inode_lock(inode), it is not safe to use __GFP_FS flag. >> > > (cc'ing Frank V. who wrote this code and -fsdevel) Frank is no longer at Amazon. I've added his correct address. > I stand corrected! You're absolutely right that it's taking the i_rwsem > for read there. That seems pretty weird, actually. For sure. I certainly didn't expect that. > I don't believe we > need to hold the inode_lock to call vfs_getxattr or vfs_listxattr, and > certainly nothing else under there requires it. > > Frank, was there some reason you decided you needed the inode_lock > there? It looks like under the hood, the xattr code requires you to take > it for write in setxattr and removexattr, but you don't need it at all > in getxattr or listxattr. Go figure. > > If there's no reason to keep it there, then in addition to removing > GFP_NOFS there I think we probably ought to just remove the inode_lock > from nfsd_getxattr and nfsd_listxattr altogether. > > Longer term, I wonder what the inode_lock is protecting in setxattr and > removexattr operations, given that get and list don't require them? > These are always delegated to the filesystem driver -- there is no > generic xattr implementation. > > Maybe we ought to do a lock pushdown on those operations at some point? > I'd bet that most of the setxattr/removexattr operations do their own > locking and don't require the i_rwsem at all. > -- > Jeff Layton <jlayton@xxxxxxxxxx> -- Chuck Lever