> On Apr 19, 2024, at 11:15 AM, Stephen Hemminger <stephen@xxxxxxxxxxxxxxxxxx> wrote:
>
> I forward networking bugs to the maintainers.
> Netdev does not use bugzilla, not sure if NFS does.
>
> Begin forwarded message:
>
> Date: Thu, 18 Apr 2024 00:00:22 +0000
> From: bugzilla-daemon@xxxxxxxxxx
> To: stephen@xxxxxxxxxxxxxxxxxx
> Subject: [Bug 218743] New: NFS-RDMA-Connected Regression Found on Upstream Linux 6.9-rc1
>
> https://bugzilla.kernel.org/show_bug.cgi?id=218743
>
>             Bug ID: 218743
>            Summary: NFS-RDMA-Connected Regression Found on Upstream Linux
>                     6.9-rc1
>            Product: Networking
>            Version: 2.5
>     Kernel Version: 6.9-rc1
>           Hardware: Intel
>                 OS: Linux
>             Status: NEW
>           Severity: high
>           Priority: P3
>          Component: Other
>           Assignee: stephen@xxxxxxxxxxxxxxxxxx
>           Reporter: manuel.gomez@xxxxxxxxxxxxxxxxxxxx
>                 CC: dennis.dalessandro@xxxxxxxxxxxxxxxxxxxx
>         Regression: Yes
> Bisected commit-id: e084ee673c77cade06ab4c2e36b5624c82608b8c
>
> On the Linux 6.9-rc1 kernel there is a performance regression for NFS file
> transfers when Connected IPoIB mode is enabled. The network switch is OPA
> (Omni-Path Architecture).
>
> The most recent good commit in my bisection was the v6.8 mainline kernel
> (e8f897f4afef0031fe618a8e94127a0934896aba). Bisecting from v6.8 to v6.9-rc1
> showed that "[e084ee673c77cade06ab4c2e36b5624c82608b8c] svcrdma: Add Write
> chunk WRs to the RPC's Send WR chain" was the culprit of the regression.
>
> Here are the steps I ran to reproduce the issue:
>
> 1. Establish IPoIB Connected mode on both client and host nodes:
>
>    echo connected > /sys/class/net/ibs785/mode
>
> 2. Start an NFS server on the host node:
>
>    systemctl start opafm
>    sleep 10
>    systemctl start nfs-server
>    modprobe svcrdma
>    echo "rdma 20049" > /proc/fs/nfsd/portlist
>    mkdir -p /mnt/nfs_test
>    mount -t tmpfs -o size=4096M tmpfs /mnt/nfs_test
>    sudo exportfs -o fsid=0,rw,async,insecure,no_root_squash 192.168.2.0/255.255.255.0:/mnt/nfs_test_testrun/
>
> 3. Ready the client node:
>
>    mkdir -p /mnt/nfs_test
>    mount -o rdma,port=20049 192.168.2.1:/mnt/nfs_test_testrun /mnt/nfs_test_testrun/
>
> 4. Run the actual test from the client node:
>
>    #!/bin/bash
>
>    fsize=268435456
>    jfile=/dev/shm/run_nfs_test2.junk
>    tfile=/dev/shm/run_nfs_test2.tmp
>    nfsfile=/mnt/nfs_test_testrun/run_nfs_test2.junk
>    rm -r -f /mnt/nfs_test_testrun/
>    rm -f ${tfile}
>    rm -f ${jfile}
>
>    dd if=/dev/urandom iflag=fullblock of=${jfile} bs=1024 count=$((fsize/1024));
>
>    for i in {1..100}; do
>        cp ${jfile} ${nfsfile};   # Bottleneck 1
>        cp ${nfsfile} ${tfile};   # Bottleneck 2
>        cmp ${jfile} ${tfile};
>        rm -f ${tfile};
>        echo "DONE with iter ${i}"
>    done;
>
>    rm -f ${jfile};
>    rm -f ${tfile};
>    echo "Done";
>
> On v6.8 I was seeing this test take about 1m50s to complete, for all 10
> iterations. On v6.9-rc1 it takes 3-7 minutes, and I also see these kernel
> traces printed continuously in dmesg during the regression:
>
> [23720.243905] INFO: task kworker/61:1:556 blocked for more than 122 seconds.
> [23720.251709]       Not tainted 6.9.0-rc1 #1
> [23720.256387] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [23720.265268] task:kworker/61:1    state:D stack:0     pid:556   tgid:556   ppid:2      flags:0x00004000
> [23720.275822] Workqueue: events __svc_rdma_free [rpcrdma]
> [23720.281803] Call Trace:
> [23720.284630]  <TASK>
> [23720.287067]  __schedule+0x210/0x660
> [23720.291063]  schedule+0x2c/0xb0
> [23720.294668]  schedule_timeout+0x146/0x160
> [23720.299249]  __wait_for_common+0x92/0x1d0
> [23720.303828]  ? __pfx_schedule_timeout+0x10/0x10
> [23720.308987]  __ib_drain_sq+0xfa/0x170 [ib_core]
> [23720.314190]  ? __pfx_ib_drain_qp_done+0x10/0x10 [ib_core]
> [23720.320343]  ib_drain_qp+0x71/0x80 [ib_core]
> [23720.325232]  __svc_rdma_free+0x28/0x100 [rpcrdma]
> [23720.330604]  process_one_work+0x196/0x3d0
> [23720.335185]  worker_thread+0x2fc/0x410
> [23720.339470]  ? __pfx_worker_thread+0x10/0x10
> [23720.344336]  kthread+0xdf/0x110
> [23720.347941]  ? __pfx_kthread+0x10/0x10
> [23720.352225]  ret_from_fork+0x30/0x50
> [23720.356317]  ? __pfx_kthread+0x10/0x10
> [23720.360602]  ret_from_fork_asm+0x1a/0x30
> [23720.365083]  </TASK>
>
> --
> You may reply to this email to add a comment.
>
> You are receiving this mail because:
> You are the assignee for the bug.

Thanks, I've seen a performance regression on one system and haven't been
able to reproduce it elsewhere. Please move this bug to Filesystems/NFS.

--
Chuck Lever
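[Editor's note: for readers without OPA/RDMA hardware, the copy-and-compare
loop at the heart of the reporter's test can be exercised against any
directory. The sketch below is a scaled-down local stand-in, not the
reporter's script: a temp directory replaces the NFS-over-RDMA mount, and the
file size and iteration count are reduced to illustrative values.]

```shell
#!/bin/sh
# Scaled-down local stand-in for the reporter's test loop (illustrative only):
# a local temp directory replaces the NFS mount, and 1 MiB of random data
# replaces the 256 MiB file used in the report.
set -e
workdir=$(mktemp -d)
jfile=$workdir/src.junk       # random source data
nfsfile=$workdir/remote.junk  # stand-in for the NFS-mounted copy
tfile=$workdir/readback.tmp   # file read back for comparison

dd if=/dev/urandom of="$jfile" bs=1024 count=1024 2>/dev/null

for i in 1 2 3; do
    cp "$jfile" "$nfsfile"    # write path ("Bottleneck 1" in the report)
    cp "$nfsfile" "$tfile"    # read path ("Bottleneck 2" in the report)
    cmp "$jfile" "$tfile"     # exits non-zero on any data mismatch
    rm -f "$tfile"
    echo "DONE with iter $i"
done

rm -rf "$workdir"
echo "Done"
```

Pointing `nfsfile` at a real NFS mount and raising the size and iteration
count recovers the shape of the original test; timing the loop (e.g. with
`time`) on two kernels gives the before/after comparison the reporter
describes.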