Re: acls+kerberos (limitation)

Olga Kornievskaia <aglo@xxxxxxxxx> · Fri, 20 Dec 2019 15:53:08 -0500

On Fri, Dec 20, 2019 at 3:11 PM Chuck Lever <chuck.lever@xxxxxxxxxx> wrote:
>
>
>
> > On Dec 20, 2019, at 3:04 PM, Olga Kornievskaia <aglo@xxxxxxxxx> wrote:
> >
> > On Fri, Dec 20, 2019 at 1:28 PM Chuck Lever <chuck.lever@xxxxxxxxxx> wrote:
> >>
> >>
> >>
> >>> On Dec 20, 2019, at 1:15 PM, Olga Kornievskaia <aglo@xxxxxxxxx> wrote:
> >>>
> >>> On Wed, Dec 18, 2019 at 2:34 PM Chuck Lever <chuck.lever@xxxxxxxxxx> wrote:
> >>>>
> >>>>
> >>>>
> >>>>> On Dec 18, 2019, at 2:31 PM, Olga Kornievskaia <aglo@xxxxxxxxx> wrote:
> >>>>>
> >>>>> On Wed, Dec 18, 2019 at 2:05 PM Trond Myklebust <trondmy@xxxxxxxxxxxxxxx> wrote:
> >>>>>>
> >>>>>> On Wed, 2019-12-18 at 12:47 -0500, Olga Kornievskaia wrote:
> >>>>>>> Hi folks,
> >>>>>>>
> >>>>>>> Is this a well know but undocumented fact that you can't set large
> >>>>>>> amount of acls (over 4096bytes, ~90acls) while mounted using
> >>>>>>> krb5i/krb5p? That if you want to get/set large acls, it must be done
> >>>>>>> over auth_sys/krb5?
> >>>>>>>
> >>>>>>
> >>>>>> It's certainly not something that I was aware of. Do you see where that
> >>>>>> limitation is coming from?
> >>>>>
> >>>>> I haven't figure it exactly but gss_unwrap_resp_integ() is failing in
> >>>>> if (mic_offset > rcv_buf->len). I'm just not sure who sets up the
> >>>>> buffer (or why  rvc_buf->len is (4280) larger than a page can a
> >>>>> page-limit might make sense to for me but it's not). So you think it
> >>>>> should have been working.
> >>>>
> >>>> The buffer is set up in the XDR encoder. But pages can be added
> >>>> by the transport... I guess rcv_buf->len isn't updated when that
> >>>> happens.
> >>>>
> >>>
> >>> Here's why the acl+krbi/krb5p is failing.
> >>>
> >>> acl tool first calls into the kernel to find out how large of a buffer
> >>> it needs to supply and gets acl size then calls down again then code
> >>> in __nfs4_get_acl_uncached() allocates a number of pages (this what
> >>> set's the available buffer length later used by the sunrpc code). That
> >>> works for non-integrity because in call_decode() the call
> >>> rpc_unwrap_resp() doesn't try to calculate the checksum on the buffer
> >>> that was just read. However, when its krb5i/krb5p we have truncated
> >>> buffer and mic offset that's larger than the existing buffer.
> >>>
> >>> I think something needs to be marked to skip doing gss for the initial
> >>> acl query?  I first try providing more space in
> >>> __nfs4_get_acl_uncached() for when authflavor=krb5i/krb5p and buflen=0
> >>> but no matter what the number is the received acl can be larger than
> >>> that thus I don't think that's a good approach.
> >>
> >> It's not strictly true that the received ACL can be always be larger.
> >> There is an upper bound on request sizes.
> >>
> >> My preference has always been to allocate a receive buffer of the maximum
> >> size before the call, just like every other request works. I can't think
> >> of any reason why retrieving an ACL has to be different. Then we can get
> >> rid of the hack in the transports to fill in those pages behind the back
> >> of the upper layers.
> >>
> >> The issue here has always been that there's no way for the client to
> >> discover the number of bytes it needs to retrieve before it sets up the
> >> GETACL.
> >>
> >> For NFSv4.1+ you can probably assume that the ACL will never be larger
> >> than the session's maximum reply size.
> >>
> >> For NFSv4.0 you'll have to make something up.
> >>
> >> But allocating a large receive buffer for this request is the only way to
> >> make the receive reliable. You should be able to do that by stuffing the
> >> recv XDR buffer with individual pages, just like nfsd does, in GETACL's
> >> encoding function.
> >>
> >> Others might have a different opinion. Or I might have completely
> >> misunderstood the issue.
> >>
> >
> > Putting a limit would be easier. I thought of using rsize (wsize) as
> > we can't get anything larger than in the payload that but that's not
> > possible. Because the code sets limits based on XATTR_MAX_SIZE which
> > is a linux server side limitation and it doesn't seem to be
> > appropriate to be applied as a generic implementation. Would it be ok
> > to change the static memory allocation to be dynamic and based on the
> > rsize? Thoughts?
>
> Why is using the NFSv4.1 session max reply size not possible? For
> NFSv4.0, rsize seems reasonable to me.

It's not possible because there is a hard limit of number of pages the
code will allocate (right now).

static ssize_t __nfs4_get_acl_uncached(struct inode *inode, void *buf,
size_t buflen)
{
        struct page *pages[NFS4ACL_MAXPAGES + 1] = {NULL, };

NFS4ACL_MAXPAGES are based on the 64K limit (from the XATTR_MAX_SIZE).

        if (npages > ARRAY_SIZE(pages))
                return -ERANGE;

Typically session size (or r/wsizes) are something like 262K or 1M.

I was just saying that I'd then would need to remove the static
structure for pages and make it dynamic based on the (rsize or session
size). I thought that r/wsize was set to whatever the session sizes
are so using the r/wsize values would make it work for both 4.0 and
4.1+.

>
> --
> Chuck Lever
>
>
>