Re: [PATCH] libtirpc: handle large numbers of supplemental groups gracefully (try #2)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Jeff Layton wrote:
> On Tue, 16 Feb 2010 14:24:09 -0500
> Peter Staubach <staubach@xxxxxxxxxx> wrote:
> 
>> Jeff Layton wrote:
>>> This is the second attempt at this patch. The main changes are that this
>>> one doesn't set a floor value for the size of the group list. There are
>>> also a few minor cleanups and comments added.
>>>
>>> If authunix_create_default() is called by a user with more than 16
>>> supplimental groups, it'll abort(), which causes the program to crash
>>> and coredump.
>>>
>>> Fix it to handle this situation gracefully. Get the number of groups
>>> that the user has first, and then allocate a big enough buffer to hold
>>> them. Then, just don't let the lower function use more than the NGRPS
>>> groups.
>>>
>>> Also fix up the error handling in this function so that it just returns
>>> a NULL pointer on error and logs a message via warnx() instead of
>>> calling abort().
>>>
>>> Reported-by: Peter Engel <peter.engel@xxxxxxx>
>>> Signed-off-by: Jeff Layton <jlayton@xxxxxxxxxx>
>>> ---
>>>  src/auth_unix.c |   62 +++++++++++++++++++++++++++++++++++++++++++++++++-----
>>>  1 files changed, 56 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/src/auth_unix.c b/src/auth_unix.c
>>> index 71ca15d..a295e71 100644
>>> --- a/src/auth_unix.c
>>> +++ b/src/auth_unix.c
>>> @@ -49,6 +49,7 @@
>>>  #include <stdlib.h>
>>>  #include <unistd.h>
>>>  #include <string.h>
>>> +#include <errno.h>
>>>  
>>>  #include <rpc/types.h>
>>>  #include <rpc/xdr.h>
>>> @@ -175,20 +176,69 @@ AUTH *
>>>  authunix_create_default()
>>>  {
>>>  	int len;
>>> +	size_t bufsize = 0;
>>>  	char machname[MAXHOSTNAMELEN + 1];
>>>  	uid_t uid;
>>>  	gid_t gid;
>>> -	gid_t gids[NGRPS];
>>> +	gid_t *gids = NULL;
>>> +	AUTH *auth;
>>> +
>>> +	if (gethostname(machname, sizeof machname) == -1) {
>>> +		warnx("%s: gethostname() failed: %s", __func__,
>>> +			strerror(errno));
>>> +		return NULL;
>>> +	}
>>>  
>>> -	if (gethostname(machname, sizeof machname) == -1)
>>> -		abort();
>>>  	machname[sizeof(machname) - 1] = 0;
>>>  	uid = geteuid();
>>>  	gid = getegid();
>>> -	if ((len = getgroups(NGRPS, gids)) < 0)
>>> -		abort();
>>> +
>>> +retry:
>>> +	len = getgroups(0, NULL);
>>> +	if (len < 0) {
>>> +		warnx("%s: failed to get number of groups: %s", __func__,
>>> +			strerror(errno));
>>> +		return NULL;
>>> +	}
>>> +
>>> +	if (len == 0)
>>> +		goto no_groups;
>>> +
>>> +	bufsize = len * sizeof(gid_t);
>>> +	gids = mem_alloc(bufsize);
>>> +	if (gids == NULL) {
>>> +		warnx("%s: memory allocation failed: %s", __func__,
>>> +			strerror(ENOMEM));
>>> +		return NULL;
>>> +	}
>>> +
>>> +	len = getgroups(len, gids);
>>> +	if (len < 0) {
>>> +		mem_free(gids, bufsize);
>>> +		/*
>>> +		 * glibc equivalent routines mention that it's possible for
>>> +		 * the number of groups to change between two getgroups calls.
>>> +		 * If that happens, retry the whole thing again.
>>> +		 */
>>> +		if (len == -EINVAL) {
>>> +			gids = NULL;
>>> +			bufsize = 0;
>>> +			goto retry;
>>> +		}
>>> +		warnx("%s: failed to get group list: %s", __func__,
>>> +			strerror(errno));
>>> +		return NULL;
>>> +	}
>>> +
>>> +	/* AUTH_UNIX has a hard limit of NGRPS supplemental groups */
>>> +	if (len > NGRPS)
>>> +		len = NGRPS;
>>> +
>>> +no_groups:
>>>  	/* XXX: interface problem; those should all have been unsigned */
>>> -	return (authunix_create(machname, uid, gid, len, gids));
>>> +	auth = authunix_create(machname, uid, gid, len, gids);
>>> +	mem_free(gids, bufsize);
>>> +	return auth;
>>>  }
>>>  
>>>  /*
>> This change to restrict the groups used to the first NGRPS
>> groups is one that we have always avoided.  It can be quite
>> confusing to the user, to have an operation fail, but then
>> to look and notice that the correct group is listed.
>>
>> Having the library abort seems odd and wrong, but this will
>> also change semantics.  Is there really a problem here,
>> after all of these years, that must be addressed?
>>
> 
> Yes, we had a bug report against nfs-utils where someone was seeing
> umount.nfs core dump because root was in >NGRPS groups.
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=565507
> 
> ...as Chuck points out, this is basically what glibc has done for
> years. While it may be confusing for users, I think it's more
> preferable than having the programs fail outright.
> 
> What sort of behavior would you propose in its place?
> 

Well, I do agree that a library routine should not call abort().
That's just pure and simple, wrong.  That bug must be fixed.

However, making a policy decision for the user also seems
wrong.  Just because glibc did something does not make it right.
It does tend to make it a standard, so the semantic change
argument becomes weaker, but this change was made to a different
RPC library, so the application must have/be changed to use it,
so all bets are off again.

My inclination would be to return NULL and see who complains.

		ps
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux