Re: [PATCH v2 0/3] NFSD DRC fixes for v6.7-rc

Chuck Lever III <chuck.lever@xxxxxxxxxx> · Sat, 11 Nov 2023 16:09:34 +0000

> On Nov 11, 2023, at 7:42 AM, Jeff Layton <jlayton@xxxxxxxxxx> wrote:
> 
> On Fri, 2023-11-10 at 11:28 -0500, Chuck Lever wrote:
>> I've chased down the reported failures with krb5i/p. Essentially
>> the DRC was broken for krb5i/p because of the extra crap those
>> security flavors stick before and after the NFS headers.
> 
> Huh, so the reported failures were due to retransmissions?

Good question. This cover letter leaves out some of the story.

I noticed test failures when running the git regression suite
with 12 threads on a sec=krb5i mount point, proto=rdma. The
failure was 100% reproducible. I saw some retransmits in the
mountstats error counters.

The initial trigger of the retransmissions is GSS sequence
window underruns, which are rather frequent with this test
set-up. The server is mandated to drop the request and
connection after each detected underrun.

At first I thought that getting rid of the underruns would
fix the issue, but then realized that that's not possible to
do completely. Then I realized that a simple retransmit is not
supposed to result in vastly different application behavior
(at least most of the time) and so dug into that.

The hole I crawled out of at the end was that the DRC was
busted. And not just one bug, but three separate issues.
Once all three are addressed, the test passes on krb5i and
krb5p.

>> These patches do address bugs in stable kernels, but they will need
>> to be massaged, tested, and applied by hand to get there. Voting now
>> open on whether it's worth the trouble for us to do it now, or wait
>> until someone complains about a particular LTS problem.

I haven't been able to successfully build a kernel older
than about v5.18 on my Fedora 38 systems. I'm thinking that,
in general, we could test stable backports using kdevops
with older Fedora vagrant boxes.

But I haven't gotten the full kdevops environment to run
here yet, and thus don't have a git regression workflow. And
I guess we don't have RDMA working in this environment yet
either. (siw might not work well or at all on older kernels).

Maybe the connection losses could be triggered instead with
disconnect injection.

>> ---
>> 
>> Chuck Lever (3):
>>      NFSD: Fix "start of NFS reply" pointer passed to nfsd_cache_update()
>>      NFSD: Update nfsd_cache_append() to use xdr_stream
>>      NFSD: Fix checksum mismatches in the duplicate reply cache
>> 
>> 
>> fs/nfsd/cache.h    |  4 +--
>> fs/nfsd/nfscache.c | 87 +++++++++++++++++++++++++++-------------------
>> fs/nfsd/nfssvc.c   | 14 ++++++--
>> 3 files changed, 65 insertions(+), 40 deletions(-)
>> 
>> --
>> Chuck Lever
>> 
> 
> Wow, nice work tracking those down. I'm sure that wasn't easy!

The hardest part was not giving up.

> Reviewed-by: Jeff Layton <jlayton@xxxxxxxxxx>

Much appreciated!

--
Chuck Lever