Re: [PATCH 6/6] NFSv4: allow getacl rpc to allocate pages on demand

Andreas Gruenbacher <agruenba@xxxxxxxxxx> · Thu, 23 Feb 2017 11:28:46 +0100

On Wed, Feb 22, 2017 at 2:53 AM, J. Bruce Fields <bfields@xxxxxxxxxx> wrote:
> On Tue, Feb 21, 2017 at 10:45:35PM +0100, Andreas Gruenbacher wrote:
>> On Tue, Feb 21, 2017 at 10:37 PM, J. Bruce Fields <bfields@xxxxxxxxxx> wrote:
>> > On Tue, Feb 21, 2017 at 10:21:05PM +0100, Andreas Gruenbacher wrote:
>> >> On Tue, Feb 21, 2017 at 7:46 PM, Chuck Lever <chuck.lever@xxxxxxxxxx> wrote:
>> >> > Hi Andreas-
>> >> >
>> >> >
>> >> >> On Feb 20, 2017, at 4:31 PM, Andreas Gruenbacher <agruenba@xxxxxxxxxx> wrote:
>> >> >>
>> >> >> On Mon, Feb 20, 2017 at 6:15 PM, J. Bruce Fields <bfields@xxxxxxxxxx> wrote:
>> >> >>> On Mon, Feb 20, 2017 at 11:42:31AM -0500, Chuck Lever wrote:
>> >> >>>>
>> >> >>>>> On Feb 20, 2017, at 11:09 AM, J. Bruce Fields <bfields@xxxxxxxxxx> wrote:
>> >> >>>>>
>> >> >>>>> On Sun, Feb 19, 2017 at 02:29:03PM -0500, Chuck Lever wrote:
>> >> >>>>>>
>> >> >>>>>>> On Feb 18, 2017, at 9:07 PM, J. Bruce Fields <bfields@xxxxxxxxxx> wrote:
>> >> >>>>>>>
>> >> >>>>>>> From: Weston Andros Adamson <dros@xxxxxxxxxx>
>> >> >>>>>>>
>> >> >>>>>>> Instead of preallocating pages, allow xdr_partial_copy_from_skb() to
>> >> >>>>>>> allocate whatever pages we need on demand.  This is what the NFSv3 ACL
>> >> >>>>>>> code does.
>> >> >>>>>>
>> >> >>>>>> The patch description does not explain why this change is
>> >> >>>>>> being done.
>> >> >>>>>
>> >> >>>>> The only justification I see is avoiding allocating pages unnecessarily.
>> >> >>>>
>> >> >>>> That makes sense. Is there a real world workload that has seen
>> >> >>>> a negative effect?
>> >> >>>>
>> >> >>>>
>> >> >>>>> Without this patch, for each getacl, we allocate 17 pages (if I'm
>> >> >>>>> calculating correctly) and probably rarely use most of them.
>> >> >>>>>
>> >> >>>>> In the v3 case I think it's 7 pages instead of 17.
>> >> >>>>
>> >> >>>> I would have guessed 9. Out of curiosity, is there a reason
>> >> >>>> documented for these size limits?
>> >> >>>
>> >> >>>
>> >> >>> In the v4 case:
>> >> >>>
>> >> >>>        #define NFS4ACL_MAXPAGES DIV_ROUND_UP(XATTR_SIZE_MAX, PAGE_SIZE)
>> >> >>>
>> >> >>> And I believe XATTR_SIZE_MAX is a global maximum on the size of any
>> >> >>> extended attribute value.
>> >> >>
>> >> >> XATTR_SIZE_MAX is the maximum size of an extended attribute. NFSv4
>> >> >> ACLs are passed through unchanged in "system.nfs4_acl".
>> >> >
>> >> > "Extended attribute" means this is a Linux-specific limit?
>> >>
>> >> Yes.
>> >>
>> >> > Is there anything that prevents a non-Linux system from constructing
>> >> > or returning an ACL that is larger than that?
>> >>
>> >> No.
>> >
>> > In the >=v4.1 case there are session limits, but they'll typically be
>> > less.  In the 4.0 case I think there's no explicit limit at all.  In
>> > practice I bet other systems are similar to Linux in that they assume
>> > peers won't send rpc replies or requests larger than about the
>> > maximum-sized read or write.  But again that'll usually be a higher
>> > limit than our ACL limit.
>> >
>> >> > What happens on a Linux client when a server returns an ACL that does
>> >> > not fit in this allotment?
>> >>
>> >> I would hope an error, but I haven't tested it.
>> >
>> > I haven't tested either, but it looks to me like the rpc layer receives
>> > a truncated request, the xdr decoding recognizes that it's truncated,
>> > and the result is an -ERANGE.
>> >
>> > Looking now I think that my "NFSv4: simplify getacl decoding" changes
>> > that to an -EIO.  More importantly, it makes that an EIO even when the
>> > calling application was only asking for the length, not the actual ACL
>> > data.  I'll fix that.
>>
>> Just be careful not to return a length from getxattr(path, name, NULL,
>> 0) that will cause getxattr(path, name, buffer, size) to fail with
>> ERANGE, please. Otherwise, user space might get very confused.
>
> Ugh, OK.  So there could be userspace code that does something like
>
>         while (getxattr(path, name, buf, size) == -ERANGE) {
>                 /* oops, must have raced with a size change */
>                 size = getxattr(path, name, NULL, 0);
>                 buf = realloc(buf, size);
>         }
>
> and you'd consider that a kernel bug not a userspace bug?

It would at least provoke errors if the above loop (with an additional
check for size == -1) didn't terminate, so I'd like to avoid that. I
see now that there is botched code in fs/xattr.c that tries to prevent
that, so I'll try to fix that so that file systems won't have to
bother.
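
A minimal sketch of such a user-space loop, with the size == -1 check and a retry bound added. The names read_xattr_value, xattr_getter, broken_getter and the max_tries parameter are hypothetical and only serve the illustration. The getter type has the same signature as getxattr(2), so the real call can be passed in directly. The stub models a file system whose length probe succeeds while the data call keeps failing with ERANGE, which is the case that would otherwise spin forever.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>

/* Same signature as getxattr(2), so the loop can be driven by a stub. */
typedef ssize_t (*xattr_getter)(const char *path, const char *name,
                                void *value, size_t size);

/*
 * Hypothetical helper: read an xattr value into a malloc'ed buffer.
 * Returns the value length and stores the buffer in *out, or returns -1
 * with errno set. Gives up after max_tries probe/read round trips.
 */
static ssize_t read_xattr_value(xattr_getter get, const char *path,
                                const char *name, void **out, int max_tries)
{
    void *buf = NULL;
    int saved;

    assert(get != NULL && out != NULL);

    for (int i = 0; i < max_tries; i++) {
        /* Probe: a NULL buffer with size 0 asks for the current length. */
        ssize_t size = get(path, name, NULL, 0);
        if (size == -1)
            goto fail;

        /* Never pass size 0 to the data call: that would be a probe again. */
        size_t cap = size ? (size_t)size : 1;
        void *tmp = realloc(buf, cap);
        if (!tmp) {
            errno = ENOMEM;
            goto fail;
        }
        buf = tmp;

        ssize_t len = get(path, name, buf, cap);
        if (len >= 0) {
            *out = buf;
            return len;
        }
        if (errno != ERANGE)
            goto fail;
        /* ERANGE: the value grew between the two calls, so probe again. */
    }
    errno = ERANGE;     /* retry budget exhausted */

fail:
    saved = errno;      /* keep the error code across free() */
    free(buf);
    errno = saved;
    return -1;
}

/*
 * Stub modelling the problem case: the probe reports 4 bytes, but the
 * data call fails with ERANGE no matter how the caller sizes the buffer.
 */
static ssize_t broken_getter(const char *path, const char *name,
                             void *value, size_t size)
{
    (void)path; (void)name; (void)value;
    if (size == 0)
        return 4;
    errno = ERANGE;
    return -1;
}
```

With the real getxattr passed as the getter, the bound only matters when the probe and the data call disagree persistently. Without it, the loop has no way to terminate against a getter like the stub.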

> I suspect that can happen both before and after my changes.
>
> So what do we want for that case?  Just -EIO?

getxattr and listxattr are trying to cast that kind of error to
-E2BIG, which seems okay.
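
Roughly, and only as a sketch of how I read that mapping rather than a quote of fs/xattr.c: an -ERANGE that comes back although the caller already supplied a maximum-sized buffer is reported as -E2BIG, since retrying with a bigger buffer cannot help. The helper name map_xattr_error and the SKETCH_XATTR_SIZE_MAX constant are made up for the example. The constant is assumed to match the 64 KiB value of XATTR_SIZE_MAX.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Assumed to mirror the Linux XATTR_SIZE_MAX limit of 64 KiB. */
#define SKETCH_XATTR_SIZE_MAX 65536
static_assert(SKETCH_XATTR_SIZE_MAX == 64 * 1024, "limit is 64 KiB");

/*
 * Hypothetical helper: error is a kernel-style result (a length >= 0 or
 * a negative errno), size is the buffer size the caller passed in.
 */
static long map_xattr_error(long error, size_t size)
{
    /* ERANGE with a maximum-sized buffer cannot be fixed by retrying. */
    if (error == -ERANGE && size >= SKETCH_XATTR_SIZE_MAX)
        return -E2BIG;
    return error;
}
```

Smaller buffers still see -ERANGE, so the usual grow-and-retry pattern in user space keeps working.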

Thanks,
Andreas