> On Aug 15, 2021, at 9:20 PM, NeilBrown <neilb@xxxxxxx> wrote:
>
> On Sun, 15 Aug 2021, Mike Javorski wrote:
>> I managed to get a cap with several discrete freezes in it, and I
>> included a chunk with 5 of them in the span of ~3000 frames. I added
>> packet comments at each frame that the tshark command reported as >1
>> sec RPC wait. Just search for "Freeze" in (wire|t)shark in packet
>> details. This is with kernel 5.13.10 provided by Arch (see
>> https://github.com/archlinux/linux/compare/a37da2be8e6c85...v5.13.10-arch1
>> for the diff vs mainline; nothing NFS/RPC related that I can identify).
>>
>> I tried unsuccessfully to get any failures with the 5.12.15 kernel.
>>
>> https://drive.google.com/file/d/1T42iX9xCdF9Oe4f7JXsnWqD8oJPrpMqV/view?usp=sharing
>>
>> The file should be downloadable anonymously.
>
> Got it, thanks.
>
> There are 3 RPC replies that arrive more than 1 second after the
> request. The replies are in frames 983, 1005, and 1777. These roughly
> correspond to where you added the "Freeze" annotations (I didn't know
> you could do that!).
>
> 983:
>   This is the (start of the) reply to a READ request in frame 980.
>   It is a 128K read. The whole reply has arrived by frame 1004, 2ms
>   later.
>   The request (frame 980) is followed 13ms later by a TCP retransmit
>   of the request, and then (20usec later) by a TCP ACK from the server.
>
>   The fact that the client needed to retransmit seems a little odd,
>   but as it is the only retransmit in the whole capture, I don't think
>   we can read much into it.
>
> 1005:
>   This is the reply to a 128K READ request in frame 793 - earlier than
>   the previous one.
>   So there were two READ requests, then a 2 second delay, then both
>   replies in reverse order.
>
> 1777:
>   This is a similar READ reply - to 1761.
>   There were READs in 1760, 1761, and 1775.
>   1760 is replied to almost immediately.
>   1761 gets a reply in 4 seconds (1777).
>   1775 never gets a reply (in the available packet capture).
>
> Looking at other delays ... most READs get a reply in under a
> millisecond. There are about 20 where the reply takes more than 1ms -
> the longest delay not already mentioned is 90ms, with reply 1857.
> The pattern here is:
>
>   READ req (1)
>   GETATTR req
>   GETATTR reply
>   READ req (2)
>   READ reply (1)
>   pause
>   READ reply (2)
>
> I suspect this is the same problem occurring, but it isn't so
> noticeable.
>
> My first thought was that the reply might be getting stuck in the TCP
> transmit queue on the server, but checking the TSval in the TCP
> timestamp option shows that - for frame 983, which shows a 2-second
> delay - the TSval is also 2 seconds later than in the previous packet.
> So the delay happens before the TCP-level decision to create the
> packet.
>
> So I cannot see any real evidence to suggest a TCP-level problem.
> Delays of 2 or 4 seconds - and maybe even 90ms - seem unlikely to be
> caused by an NFSd problem.
>
> So my guess is that the delay comes from the filesystem. Maybe.
> What filesystem are you exporting?
>
> How can we check this? Probably by turning on nfsd tracing.
> There are a bunch of tracepoints related to reading:
>
>   trace_nfsd_read_start
>   trace_nfsd_read_done
>   trace_nfsd_read_io_done
>   trace_nfsd_read_err
>   trace_nfsd_read_splice
>   trace_nfsd_read_vector
>
> Enabling them might be useful, as you should be able to see whether
> the delay was within a single read request or between two read
> requests. But I don't have much (if any) experience in enabling
> tracepoints. I really should try that some day.
> Maybe you can find guidance on using these tracepoints somewhere ...
> or maybe you already know how :-)
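Those are ordinary trace events under the "nfsd" subsystem in tracefs,
so as a rough sketch (assuming tracefs is mounted at
/sys/kernel/tracing; the output file name below is just an example)
they could be enabled by hand:

  # cd /sys/kernel/tracing
  # echo 'nfsd:nfsd_read_*' > set_event   # enable all the nfsd_read_* events
  # cat trace_pipe > /tmp/nfsd-read.log   # ^C once a freeze has been captured

But trace-cmd takes care of all that bookkeeping for you.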
Try something like:

  # trace-cmd record -e nfsd:nfsd_read_\*

When the test is done, ^C the trace-cmd program. The trace records are
stored in a file called trace.dat. You can view them with:

  # trace-cmd report | less

The trace.dat file is portable. It carries the format specifiers for
all recorded events, so the trace records can be viewed on other
systems.

--
Chuck Lever
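As a further sketch of how the report might be skimmed (it assumes
only the nfsd_read_* event names listed above), the read events can be
pulled out and scanned for large timestamp gaps:

  # trace-cmd report | grep -E 'nfsd_read_(start|done|io_done)' | less

If a multi-second gap shows up between a read's "start" event and its
"done"/"io_done" events, the delay is inside that read (pointing at
the exported filesystem); if the gap is between one read completing
and the next one starting, the delay is happening somewhere else,
before nfsd begins processing the read.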