Re: [PATCH] nfsd: don't use GFP_KERNEL from nfsd_getxattr()/nfsd_listxattr()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




> On Apr 19, 2023, at 7:32 PM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
> 
> On Wed, Apr 19, 2023 at 01:51:12PM +0000, Chuck Lever III wrote:
>> 
>> 
>>> On Apr 16, 2023, at 7:37 PM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
>>> 
>>> On Sun, Apr 16, 2023 at 07:51:41AM -0400, Jeff Layton wrote:
>>>> On Sun, 2023-04-16 at 08:21 +0900, Tetsuo Handa wrote:
>>>>> On 2023/04/16 3:40, Jeff Layton wrote:
>>>>>> On Sun, 2023-04-16 at 02:11 +0900, Tetsuo Handa wrote:
>>>>>>> On 2023/04/16 1:13, Chuck Lever III wrote:
>>>>>>>>> On Apr 15, 2023, at 7:07 AM, Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx> wrote:
>>>>>>>>> 
>>>>>>>>> Since GFP_KERNEL is GFP_NOFS | __GFP_FS, usage like GFP_KERNEL | GFP_NOFS
>>>>>>>>> does not make sense. Drop __GFP_FS flag in order to avoid deadlock.
>>>>>>>> 
>>>>>>>> The server side threads run in process context. GFP_KERNEL
>>>>>>>> is safe to use here -- as Jeff said, this code is not in
>>>>>>>> the server's reclaim path. Plenty of other call sites in
>>>>>>>> the NFS server code use GFP_KERNEL.
>>>>>>> 
>>>>>>> GFP_KERNEL memory allocation calls filesystem's shrinker functions
>>>>>>> because of __GFP_FS flag. My understanding is
>>>>>>> 
>>>>>>> Whether this code is in memory reclaim path or not is irrelevant.
>>>>>>> Whether memory reclaim path might hold lock or not is relevant.
>>>>>>> 
>>>>>>> . Therefore, question is, does nfsd hold i_rwsem during memory reclaim path?
>>>>>>> 
>>>>>> 
>>>>>> No. At the time of these allocations, the i_rwsem is not held.
>>>>> 
>>>>> Excuse me? nfsd_getxattr()/nfsd_listxattr() _are_ holding i_rwsem
>>>>> via inode_lock_shared(inode) before kvmalloc(GFP_KERNEL | GFP_NOFS) allocation.
>>>>> That's why
>>>>> 
>>>>> /*
>>>>> * We're holding i_rwsem - use GFP_NOFS.
>>>>> */
>>>>> 
>>>>> is explicitly there in nfsd_listxattr() side.
>>> 
>>> You can do GFP_KERNEL allocations holding the i_rwsem just fine.
>>> All that it requires is the caller holds a reference to the inode,
>>> and at that point inode will should skip the given inode without
>>> every locking it.
>> 
>> This suggests that the fix is to replace "GFP_KERNEL | GFP_NOFS"
>> with "GFP_KERNEL" /and/ ensure those paths are holding an
>> appropriate inode reference.
> 
> If the code that provided the inode to nfsd_listxattr() did not
> already have an active inode reference in the first place then there
> are much, much bigger UAF problems to worry about than simple
> memory reclaim deadlocks.
> 
> That said, nfsd_listxattr() does:
> 
>        dentry = fhp->fh_dentry;
>        inode = d_inode(dentry);
>        *lenp = 0;
> 
>        inode_lock_shared(inode);
> 
>        len = vfs_listxattr(dentry, NULL, 0);
> 
> Given that a dentry pointing to an inode *must* hold an active
> reference to that inode, I don't see how it is possible this code
> path could be using an unreferenced inode.
> 
> nfsd_getxattr() has a similar code fragment to obtain the inode as
> well, so same goes for that...

Dave, thanks for handling the due diligence! I was not 100% sure
about code that handles xattrs rather than the primary byte stream
of a file.

Tetsuo, you can send a v2, or just let me know and I will make
a patch to correct the GFP flags.

--
Chuck Lever






[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux