On Fri, Apr 17, 2020 at 05:48:35PM -0400, Chuck Lever wrote: > I've hit a snag here. > > I reverted 241b1f419f0e on my server, and all tests completed > successfully. > > I reverted 241b1f419f0e on my client, and now krb5p is failing. Using > xdr_buf_trim does the right thing on the server, but on the client it > has exposed a latent bug in gss_unwrap_resp_priv() (ie, the bug does > not appear to be harmful until 241b1f419f0e has been reverted). > > The calculation of au_ralign in that function is wrong, and that forces > rpc_prepare_reply_pages to allocate a zero-length tail. xdr_buf_trim() > then lops off the end of each subsequent clear-text RPC message, and > eventually a short READ results in test failures. > > After experimenting with this for a day, I don't see any way for > gss_unwrap_resp_priv() to estimate au_ralign based on what gss_unwrap() > has done to the xdr_buf. The kerberos_v1 unwrap method does not appear > to have any trailing checksum, for example, but v2 does. > > The best approach for now seems to be to have the pseudoflavor-specific > unwrap methods return the correct ralign value. A straightforward way > to do this would be to add a *int parameter to ->gss_unwrap that would > be set to the proper value; or hide that value somewhere in the xdr_buf. > > Any other thoughts or random bits of inspiration? No. Among other things, a quick skim wasn't enough to remind me what au_ralign is and why we have both that and au_rslack.... Sorry! I've got not much to offer but sympathy. ... I have a random thought out of left field: after the xdr_stream conversion, fs/nfs/nfs4xdr.c mostly doesn't deal directly with the reply buffer any more. It calls xdr_inline_decode(xdr, n) and gets back a pointer to the next n bytes of rpc data. Or it calls xdr_read_pages() after which read data's supposed to be moved to the right place. Would it be possible to delay rpcsec_gss decoding until then? Would that make things simpler or more complicated? Eh, I think the answer is probably "more complicated". --b.