Re: non-stop kworker NFS/RPC write traffic even after unmount

Hi,

On 12/16/24 1:34 AM, Trond Myklebust wrote:
On Sun, 2024-12-15 at 13:38 +0100, Rik Theys wrote:
Hi,

We are experiencing an issue on our Rocky 9 NFS server and Rocky 8,
Rocky 9 and Fedora 41 clients.

The server is (now) running upstream Linux 6.11.11 and the Fedora 41
clients are running the Fedora 6.11.11 kernel. The Rocky 8 and 9
machines are running the latest Rocky 8/9 kernels.

Suddenly, a number of clients start to send an abnormal amount of NFS
traffic to the server that saturates their link and never seems to
stop. Running iotop on the clients shows kworker-{rpciod,nfsiod,xprtiod}
processes generating the write traffic. On the server side, the system
seems to process the traffic as the disks are processing the write
requests.

This behavior continues even after stopping all user processes on the
clients and unmounting the NFS mount on the client. Is this normal? I
was under the impression that once the NFS mount is unmounted no
further traffic to the server should be visible?

Not all clients seem to trigger this issue. On a Fedora 41 client that
(auto)mounts home directories from the NFS server the behavior seems to
be triggered when I start Thunderbird and let it process a lot of new
mail (mail from the IMAP server is stored in the Thunderbird cache
that's stored in the NFS-mounted home directory). This triggers the
high write traffic of the kworker threads. At first, Thunderbird
behaves normally but gets really slow over time. Stopping Thunderbird
does not stop the kworker threads and they keep sending a lot of
traffic to the server.

Can you point me to some steps to further diagnose this? Where can I
find what triggers the creation of these kworker threads? Why does
iotop show the write traffic with these threads, and not the
Thunderbird threads?

There haven't been many changes to our kernels on the Rocky side
recently. Is it possible that a Fedora 41 client running a more recent
kernel somehow triggers a behavior on the server that causes the Rocky
clients to start misbehaving?

Which operations are the clients sending to the server? Ideally you'll
want to look at a wireshark trace to see what is being sent on the
wire, but it might be sufficient to watch the 'nfsstat' output on both
the clients and server to see what is anomalous or different about the
traffic when the issue is occurring.
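
The nfsstat comparison suggested above could be approximated by taking two
snapshots of the per-operation counters a few seconds apart and looking at
which operations grew the most. What follows is only a minimal sketch: the
helper name counter_deltas is hypothetical, the counters are assumed to have
already been collected into plain dictionaries (for example by hand from
'nfsstat' output), and the numbers are made up for illustration, not taken
from the affected systems.

```python
def counter_deltas(before, after):
    """Return (operation, increase) pairs, largest increase first.

    `before` and `after` map an operation name to its cumulative call
    count at two points in time. Operations that did not increase are
    left out of the result.
    """
    deltas = {op: after[op] - before.get(op, 0) for op in after}
    return sorted(
        ((op, d) for op, d in deltas.items() if d > 0),
        key=lambda item: item[1],
        reverse=True,
    )


# Made-up counters for illustration only.
before = {"read": 100, "write": 200, "commit": 10, "getattr": 50}
after = {"read": 101, "write": 90200, "commit": 4010, "getattr": 52}

for op, increase in counter_deltas(before, after):
    print(f"{op:10s} +{increase}")
```

An operation whose count grows by orders of magnitude more than the others
between snapshots would be the first candidate to look at in a capture.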

When I look at the captures I've taken, wireshark doesn't seem to recognize the traffic as NFS or RPC. I've attached a small portion of a capture. I will try to find out how to force wireshark to identify the traffic as NFS.

The traffic seems to be generated by the kworker threads I've mentioned above. Is it normal for a client to keep sending traffic using these threads to an NFS server even if there isn't an active mount anymore? Is the callback channel kept open between the client and server if a mount is unmounted?
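
One way to check whether a client still holds a connection to the server after the unmount might be to look for TCP sockets whose remote port is 2049 in /proc/net/tcp. The following is a rough sketch, not a definitive tool: the function name nfs_connections is hypothetical, it only covers IPv4, it assumes NFS over TCP on the standard port 2049, and it assumes a little-endian host when decoding the hexadecimal address.

```python
import socket

NFS_PORT = 2049  # standard NFS port; assumed, adjust if the server differs


def nfs_connections(proc_net_tcp_text, port=NFS_PORT):
    """Return (remote_ip, tcp_state) for sockets talking to `port`.

    `proc_net_tcp_text` is the full text of /proc/net/tcp. The remote
    address is field 2 ("HEXADDR:HEXPORT") and the state is field 3
    (hexadecimal; 1 means ESTABLISHED).
    """
    conns = []
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue
        addr_hex, port_hex = fields[2].split(":")
        if int(port_hex, 16) != port:
            continue
        # The address is stored in host byte order; reverse the bytes
        # on the assumption that the host is little-endian.
        addr = socket.inet_ntoa(bytes.fromhex(addr_hex)[::-1])
        conns.append((addr, int(fields[3], 16)))
    return conns
```

On a client this could be called as nfs_connections(open("/proc/net/tcp").read()) before and after the unmount to see whether any entries remain.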

Regards,

Rik

--
Rik Theys
System Engineer
KU Leuven - Dept. Elektrotechniek (ESAT)
Kasteelpark Arenberg 10 bus 2440  - B-3001 Leuven-Heverlee
+32(0)16/32.11.07
----------------------------------------------------------------
<<Any errors in spelling, tact or fact are transmission errors>>

Attachment: partial.pcap
Description: application/vnd.tcpdump.pcap

