Thanks for the pointer, Chuck. I ran the trace capture and was able to
trigger two freezes in relatively short order. Here is that trace file:

https://drive.google.com/file/d/1qk_VIMkj8aTeQIg5O0W3AvWyeDSWL3vW/view?usp=sharing

- mike

On Mon, Aug 16, 2021 at 6:21 AM Chuck Lever III <chuck.lever@xxxxxxxxxx> wrote:
>
> > On Aug 15, 2021, at 9:20 PM, NeilBrown <neilb@xxxxxxx> wrote:
> >
> > On Sun, 15 Aug 2021, Mike Javorski wrote:
> >> I managed to get a capture with several discrete freezes in it, and I
> >> included a chunk with 5 of them in the span of ~3000 frames. I added
> >> packet comments at each frame that the tshark command reported as a
> >> >1 sec RPC wait. Just search for "Freeze" in (wire|t)shark in the
> >> packet details. This is with kernel 5.13.10 as provided by Arch (see
> >> https://github.com/archlinux/linux/compare/a37da2be8e6c85...v5.13.10-arch1
> >> for the diff vs. mainline; nothing NFS/RPC-related that I can identify).
> >>
> >> I tried unsuccessfully to get any failures with the 5.12.15 kernel.
> >>
> >> https://drive.google.com/file/d/1T42iX9xCdF9Oe4f7JXsnWqD8oJPrpMqV/view?usp=sharing
> >>
> >> The file should be downloadable anonymously.
> >
> > Got it, thanks.
> >
> > There are 3 RPC replies that arrive more than 1 second after their
> > requests. The replies are in frames 983, 1005, and 1777. These roughly
> > correspond to where you added the "Freeze" annotations (I didn't know
> > you could do that!).
> >
> > 983:
> > This is the (start of the) reply to a READ request in frame 980.
> > It is a 128K read. The whole reply has arrived by frame 1004, 2ms
> > later.
> > The request (frame 980) is followed 13ms later by a TCP retransmit
> > of the request, and then (20usec later) by a TCP ACK from the server.
> >
> > The fact that the client needed to retransmit seems a little odd,
> > but as it is the only retransmit in the whole capture, I don't think
> > we can read much into it.
> >
> > 1005:
> > This is the reply to a 128K READ request in frame 793 - earlier than
> > the previous one.
> > So there were two READ requests, then a 2-second delay, then both
> > replies in reverse order.
> >
> > 1777:
> > This is a similar READ reply - to 1761.
> > There were READs in 1760, 1761, and 1775:
> > 1760 is replied to almost immediately,
> > 1761 gets a reply in 4 seconds (1777),
> > 1775 never gets a reply (in the available packet capture).
> >
> > Looking at other delays ... most READs get a reply in under a
> > millisecond. There are about 20 where the reply takes more than 1ms -
> > the longest delay not already mentioned is 90ms, with reply 1857.
> > The pattern here is:
> > READ req (1)
> > GETATTR req
> > GETATTR reply
> > READ req (2)
> > READ reply (1)
> > pause
> > READ reply (2)
> >
> > I suspect this is the same problem occurring, but it isn't so
> > noticeable.
> >
> > My first thought was that the reply might be getting stuck in the TCP
> > transmit queue on the server, but checking the TSval in the TCP
> > timestamp option shows that - for frame 983, which shows a 2-second
> > delay - the TSval is also 2 seconds later than the previous packet.
> > So the delay happens before the TCP-level decision to create the
> > packet.
> >
> > So I cannot see any real evidence to suggest a TCP-level problem.
> > And delays of 2 or 4 seconds - maybe even 90ms - seem unlikely to be
> > caused by an nfsd problem.
> >
> > So my guess is that the delay comes from the filesystem. Maybe.
> > What filesystem are you exporting?
> >
> > How can we check this? Probably by turning on nfsd tracing.
> > There are a bunch of tracepoints related to reading:
> >
> > trace_nfsd_read_start
> > trace_nfsd_read_done
> > trace_nfsd_read_io_done
> > trace_nfsd_read_err
> > trace_nfsd_read_splice
> > trace_nfsd_read_vector
> >
> > Enabling them might be useful, as you should be able to see whether
> > the delay was within one read request or between two read requests.
> > But I don't have much (if any) experience with enabling tracepoints.
> > I really should try that some day. Maybe you can find guidance on
> > using these tracepoints somewhere ... or maybe you already know how :-)

> Try something like:
>
> # trace-cmd record -e nfsd:nfsd_read_\*
>
> When the test is done, ^C the trace-cmd program. The trace records
> are stored in a file called trace.dat. You can view them with:
>
> # trace-cmd report | less
>
> The trace.dat file is portable. It carries the format specifiers
> for all recorded events, so the trace records can be viewed on
> other systems.
>
>
> --
> Chuck Lever
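Neil's question above (is the delay within one read request, or between two?)
can be answered by post-processing the trace-cmd output. The following is only
a sketch, not something from the thread: the `read_latency` helper name is
invented, and it assumes the default `trace-cmd report` line format (a
"<seconds>.<microseconds>:" timestamp field and an "xid=0x..." argument on
each nfsd read event):

```shell
# Sketch: pair each nfsd_read_start with its nfsd_read_done by RPC xid
# and print the elapsed time, flagging reads slower than one second.
# Assumes default "trace-cmd report" formatting (see note above).
read_latency() {
    awk '
    {
        # Pick out the timestamp field (e.g. "1234.567890:") and the xid.
        ts = ""
        for (i = 1; i <= NF; i++)
            if ($i ~ /^[0-9]+\.[0-9]+:$/) ts = $i + 0
        if (!match($0, /xid=0x[0-9a-fA-F]+/)) next
        xid = substr($0, RSTART, RLENGTH)
    }
    /nfsd_read_start/ { start[xid] = ts }
    /nfsd_read_done/ && (xid in start) {
        delta = ts - start[xid]
        printf "%s %.6fs%s\n", xid, delta, (delta > 1 ? "  <-- slow" : "")
        delete start[xid]
    }'
}

# Usage:  trace-cmd report | read_latency
```

If a freeze shows up as a large start-to-done gap here, the stall is inside
the server's read path (supporting the filesystem theory); if every read
completes quickly despite the slow RPC reply on the wire, the delay sits
between requests instead.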