Re: [PATCH] nfsd: Make creates return EEXIST correctly instead of EPERM

Oleg Drokin <green@xxxxxxxxxxxxxx> · Fri, 22 Jul 2016 11:13:20 -0400

On Jul 22, 2016, at 6:55 AM, J. Bruce Fields wrote:

> On Fri, Jul 22, 2016 at 02:35:26AM -0400, Oleg Drokin wrote:
>> 
>> On Jul 21, 2016, at 9:57 PM, J. Bruce Fields wrote:
>> 
>>> On Thu, Jul 21, 2016 at 04:37:40PM -0400, Oleg Drokin wrote:
>>>> 
>>>> On Jul 21, 2016, at 4:34 PM, J. Bruce Fields wrote:
>>>> 
>>>>> On Fri, Jul 08, 2016 at 05:53:19PM -0400, Oleg Drokin wrote:
>>>>>> 
>>>>>> On Jul 8, 2016, at 4:54 PM, J. Bruce Fields wrote:
>>>>>> 
>>>>>>> On Thu, Jul 07, 2016 at 09:47:46PM -0400, Oleg Drokin wrote:
>>>>>>>> It looks like we are bit overzealous about failing mkdir/create/mknod
>>>>>>>> with permission denied if the parent dir is not writeable.
>>>>>>>> Need to make sure the name does not exist first, because we need to
>>>>>>>> return EEXIST in that case.
>>>>>>>> 
>>>>>>>> Signed-off-by: Oleg Drokin <green@xxxxxxxxxxxxxx>
>>>>>>>> ---
>>>>>>>> A very similar problem exists with symlinks, but the patch is more
>>>>>>>> involved, so assuming this one is ok, I'll send a symlink one separately.
>>>>>>>> fs/nfsd/nfs4proc.c |  6 +++++-
>>>>>>>> fs/nfsd/vfs.c      | 11 ++++++++++-
>>>>>>>> 2 files changed, 15 insertions(+), 2 deletions(-)
>>>>>>>> 
>>>>>>>> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
>>>>>>>> index de1ff1d..0067520 100644
>>>>>>>> --- a/fs/nfsd/nfs4proc.c
>>>>>>>> +++ b/fs/nfsd/nfs4proc.c
>>>>>>>> @@ -605,8 +605,12 @@ nfsd4_create(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
>>>>>>>> 
>>>>>>>> 	fh_init(&resfh, NFS4_FHSIZE);
>>>>>>>> 
>>>>>>>> +	/*
>>>>>>>> +	 * We just check thta parent is accessible here, nfsd_* do their
>>>>>>>> +	 * own access permission checks
>>>>>>>> +	 */
>>>>>>>> 	status = fh_verify(rqstp, &cstate->current_fh, S_IFDIR,
>>>>>>>> -			   NFSD_MAY_CREATE);
>>>>>>>> +			   NFSD_MAY_EXEC);
>>>>>>>> 	if (status)
>>>>>>>> 		return status;
>>>>>>>> 
>>>>>>>> diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
>>>>>>>> index 6fbd81e..6a45ec6 100644
>>>>>>>> --- a/fs/nfsd/vfs.c
>>>>>>>> +++ b/fs/nfsd/vfs.c
>>>>>>>> @@ -1161,7 +1161,11 @@ nfsd_create(struct svc_rqst *rqstp, struct svc_fh *fhp,
>>>>>>>> 	if (isdotent(fname, flen))
>>>>>>>> 		goto out;
>>>>>>>> 
>>>>>>>> -	err = fh_verify(rqstp, fhp, S_IFDIR, NFSD_MAY_CREATE);
>>>>>>>> +	/*
>>>>>>>> +	 * Even though it is a create, first we see if we are even allowed
>>>>>>>> +	 * to peek inside the parent
>>>>>>>> +	 */
>>>>>>>> +	err = fh_verify(rqstp, fhp, S_IFDIR, NFSD_MAY_EXEC);
>>>>>>> 
>>>>>>> Looks like in the v3 case we haven't actually locked the directory yet
>>>>>>> at this point so this check is a little race-prone.
>>>>>> 
>>>>>> In reality this check is not really needed, I suspect.
>>>>>> When we call vfs_create/mknod/mkdir later on, it has it's own permission check
>>>>>> anyway so if there was a race and somebody changed dir access in the middle,
>>>>>> there's going to be another check anyway and it would be caught.
>>>>>> Unless there's some weird server-side permission wiggling as well that makes it
>>>>>> ineffective, but I imagine that one cannot really change in a racy way?
>>>>> 
>>>>> Yeah, I think I'll just change those NFSD_MAY_EXEC's to NFSD_MAY_NOP's.
>>>>> We still need the fh_verify there since it's also what does the
>>>>> filehandle->dentry translation, but we don't need permission checking
>>>>> here yet.
>>>> 
>>>> This will likely need an extra test to ensure that when you
>>>> do mkdir where you do not have exec permissions, you would get EACCES instead
>>>> of EEXIST, otherwise that would be information leakage, no?
>>>> Or do you think the second time we do nfsd_permission, that would be covered?
>>> 
>>> No, you're right, for some reason I thought that the check for a
>>> positive inode didn't happen till later.  But actually the logic is
>>> basically:
>>> 
>>> 	lock inode
>>> 	lookup_one_len
>>> 	return nfserr_exist if looked up dentry is positive.
>>> 	check for create permission
>>> 	vfs_create
>>> 
>>> So, yes, the initial MAY_EXEC test's needed to prevent that information
>>> leak.
>>> 
>>> That said... I wonder why it's done that way?  Seems to me we could just
>>> tremove that nfserr_exist check and the vfs would handle it for us....
>>> I'll try that.
>> 
>> It won't work because the very first thing vfs_create does is may_create(),
>> and so you get EACCES right there instead of the EEXIST.
> 
> static inline int may_create(struct inode *dir, struct dentry *child)
> {
>        audit_inode_child(dir, child, AUDIT_TYPE_CHILD_CREATE);
>        if (child->d_inode)
>                return -EEXIST;
> 	...
> 
> So it looks OK to me.

Hm, in fact indeed. I was just too worked up about the client side, but on the
server side there was a real lookup already, so it does look workable.

> 
> --b.

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html