> On Apr 15, 2020, at 3:25 PM, Bruce Fields <bfields@xxxxxxxxxxxx> wrote:
> 
> On Wed, Apr 15, 2020 at 01:05:11PM -0400, Chuck Lever wrote:
>> Hi Bruce and Jeff:
>> 
>> Testing intensive workloads with NFSv3 and NFSv4.0 on NFS/RDMA with krb5i
>> or krb5p results in a pretty quick workload failure. Closer examination
>> shows that the client is able to overrun the GSS sequence window with
>> some regularity. When that happens, the server drops the connection.
>> 
>> However, when the client retransmits requests with lost replies, they
>> never hit in the DRC, and that results in unexpected failures of non-
>> idempotent requests.
>> 
>> The retransmitted XIDs are found in the DRC, but the retransmitted request
>> has a different checksum than the original. We're hitting the "mismatch"
>> case in nfsd_cache_key_cmp for these requests.
>> 
>> I tracked down the problem to the way the DRC computes the length of the
>> part of the buffer it wants to checksum. nfsd_cache_csum uses
>> 
>> head.iov_len + page_len
>> 
>> and then caps that at RC_CSUMLEN.
>> 
>> That works fine for krb5 and sys, but the GSS unwrap functions
>> (integ_unwrap_data and priv_unwrap_data) don't appear to update head.iov_len
>> properly. So nfsd_cache_csum's length computation is significantly larger
>> than the clear-text message, and that allows stale parts of the xdr_buf
>> to be included in the checksum.
>> 
>> Using xdr_buf_subsegment() at the end of integ_unwrap_data sets the xdr_buf
>> lengths properly and fixes the situation for krb5i.
>> 
>> I don't see a similar solution for priv_unwrap_data: there's no MIC len
>> available, and priv_len is not the actual length of the clear-text message.
>> 
>> Moreover, the comment in fix_priv_head() is disturbing. I don't see anywhere
>> where the relationship between the buf's head/len and how svc_defer works is
>> authoritatively documented. It's not clear exactly how priv_unwrap_data is
>> supposed to accommodate svc_defer, or whether integ_unwrap_data also needs
>> to accommodate it.
>> 
>> So I can't tell if the GSS unwrap functions are wrong or if there's a more
>> accurate way to compute the message length in nfsd_cache_csum. I suspect
>> both could use some improvement, but I'm not certain exactly what that
>> might be.
> 
> I don't know, I tried looking through that code and didn't get any
> further than you. The gss unwrap code does look suspect to me. It
> needs some kind of proper design, as it stands it's just an accumulation
> of fixes.

Having recently completed overhauling the client-side equivalents, I agree
with you there.

I've now convinced myself that because nfsd_cache_csum might need to advance
into the first page of the Call message, rq_arg.head.iov_len must contain an
accurate length so that csum_partial is applied correctly to the head buffer.
Therefore it is the preceding code that needs to set up rq_arg.head.iov_len
correctly. The GSS unwrap functions have to do this, and therefore these must
be fixed. I would theorize that svc_defer also depends on head.iov_len being
set correctly.

As far as how rq_arg.len needs to be set for svc_defer, I think I need to
have some kind of test case to examine how that path is triggered. Any
advice appreciated.

--
Chuck Lever
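
[Editorial note: to make the failure mode concrete, here is a minimal
user-space sketch of the length computation described in the thread. The
struct, its fields, and the RC_CSUMLEN value are simplified stand-ins for
the kernel's xdr_buf and DRC code, not the actual implementation; it only
illustrates why an untrimmed head.iov_len changes what gets checksummed.]

/*
 * Minimal sketch (not kernel code): if GSS unwrap leaves head.iov_len
 * larger than the clear-text message, the DRC checksums stale trailing
 * bytes and a retransmitted request no longer matches its cache entry.
 */
#include <stdio.h>
#include <stddef.h>

#define RC_CSUMLEN 256		/* stand-in for the DRC's checksum cap */

struct fake_xdr_buf {
	const char *head_base;	/* stand-in for rq_arg.head[0].iov_base */
	size_t head_len;	/* stand-in for rq_arg.head[0].iov_len */
	size_t page_len;	/* stand-in for rq_arg.page_len */
};

/* Mirrors the length computation described above for nfsd_cache_csum(). */
static size_t csum_len(const struct fake_xdr_buf *buf)
{
	size_t len = buf->head_len + buf->page_len;

	return len > RC_CSUMLEN ? RC_CSUMLEN : len;
}

int main(void)
{
	/* 40 bytes of clear-text args followed by stale wrap-token bytes */
	static const char head[200] = "clear-text NFS arguments ...";

	struct fake_xdr_buf trimmed = {
		.head_base = head,
		.head_len  = 40,	/* unwrap trimmed to clear-text length */
		.page_len  = 0,
	};
	struct fake_xdr_buf untrimmed = trimmed;

	untrimmed.head_len = sizeof(head);	/* unwrap left the old length */

	printf("bytes checksummed (trimmed head):   %zu\n", csum_len(&trimmed));
	printf("bytes checksummed (untrimmed head): %zu\n", csum_len(&untrimmed));
	/*
	 * The untrimmed case folds stale bytes beyond the clear-text message
	 * into the checksum, so the retransmit hits the "mismatch" case.
	 */
	return 0;
}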