On Fri, 2022-07-22 at 00:30 +0800, Zorro Lang wrote: > On Thu, Jul 21, 2022 at 03:32:49PM +0000, Chuck Lever III wrote: > > > > > > > On Jul 21, 2022, at 11:29 AM, Zorro Lang <zlang@xxxxxxxxxx> wrote: > > > > > > On Thu, Jul 21, 2022 at 10:29:59AM +0800, bxue@xxxxxxxxxx wrote: > > > > From: Boyang Xue <bxue@xxxxxxxxxx> > > > > > > > > The test requires larger scratch dev size when running on top of NFS other > > > > than ext4 and xfs. It requires at least 27GB in my test. Without this > > > > requirement, the test run never finishes on NFS, leaving 100% scratch disk > > > > space use. > > > > > > > > Signed-off-by: Boyang Xue <bxue@xxxxxxxxxx> > > > > --- > > > > Hi, > > > > > > > > I find generic/476 easily goes into an infinite run on top of NFS. When it > > > > happens, the common pattern is 100% disk space use of SCRATCH_MNT, and > > > > `nfsiostat` shows 50% write error on SCRATCH_MNT. When I run it with a large > > > > enough SCRATCH_MNT, the problem disappears. So I post this patch to add the size > > > > requirement. > > > > > > > > Please help review this patch. Thanks! > > > > > > > > -Boyang > > > > > > > > tests/generic/476 | 1 + > > > > 1 file changed, 1 insertion(+) > > > > > > > > diff --git a/tests/generic/476 b/tests/generic/476 > > > > index 212373d1..dcc7c3da 100755 > > > > --- a/tests/generic/476 > > > > +++ b/tests/generic/476 > > > > @@ -24,6 +24,7 @@ _cleanup() > > > > _supported_fs generic > > > > > > > > _require_scratch > > > > +_require_scratch_size $((27 * 1024 * 1024)) # 27GB > > > > > > At first, most of other filesystems don't need this requirement, that will > > > reduce test coverage of other fs suddently. Second, there's not a clear > > > reason to prove NFS (or others) need a 27+GB device to run this test. Due to > > > the generic/476 does nothing special, except random I/Os, even running with > > > ENOSPC... > > > > > > So one difference of running with small device or large enough device is the > > > chance to run with ENOSPC. I think the device size isn't the root cause of > > > nfs hang you hit, I doubt if it's a NFS bug with ENOSPC, or something else > > > bug which is triggered by someone random I/O operation. > > > > > > We'd better not to skip a known generic test (from upstream fstests directly) > > > if without a clear reason. That might cause we miss bugs or test coverage, > > > better to make sure if it's a real bug at first. Then think about if we need > > > to improve the case or fix a bug. > > > > +1 > > > > I can help troubleshoot the NFS-related aspects of this further, if needed. > > Thanks Chuck, I think we'd better to cc linux-nfs list. > > I can reproduce generic/476 hang on nfs[0] too, with real disk or Virtio disk, > it's not a loop device related issue. And sometimes it test passed[1], even with > a 10G loop device[2] (which I/O speed is very fast). If you or nfs forks need > full console log, I can provide. But I think it's easy to reproduce this bug > by running generic/476 on a general nfs v4.2. > > Thanks, > Zorro > > [0] > [173303.753434] sysrq: Show Memory > [173303.764679] Mem-Info: > [173303.765683] active_anon:1872 inactive_anon:22756 isolated_anon:0 > [173303.765683] active_file:47360 inactive_file:1230085 isolated_file:0 > [173303.765683] unevictable:0 dirty:3 writeback:0 > [173303.765683] slab_reclaimable:108117 slab_unreclaimable:100595 > [173303.765683] mapped:11734 shmem:7363 pagetables:544 bounce:0 > [173303.765683] kernel_misc_reclaimable:0 > [173303.765683] free:49561 free_pcp:5112 free_cma:0 > [173303.780542] Node 0 active_anon:7488kB inactive_anon:91024kB active_file:189440kB inactive_file:4920340kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:46936kB dirty:12kB writeback:0kB shmem:29452kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 18432kB writeback_tmp:0kB kernel_stack:5920kB pagetables:2176kB all_unreclaimable? no > [173303.790519] Node 0 DMA free:14336kB boost:0kB min:160kB low:200kB high:240kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15992kB managed:15360kB mlocked:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB > [173303.799060] lowmem_reserve[]: 0 2659 6183 6183 6183 > [173303.800781] Node 0 DMA32 free:123700kB boost:0kB min:28992kB low:36240kB high:43488kB reserved_highatomic:0KB active_anon:12kB inactive_anon:1452kB active_file:61020kB inactive_file:2279284kB unevictable:0kB writepending:8kB present:3129308kB managed:2801628kB mlocked:0kB bounce:0kB free_pcp:3508kB local_pcp:768kB free_cma:0kB > [173303.810096] lowmem_reserve[]: 0 0 3524 3524 3524 > [173303.812131] Node 0 Normal free:60208kB boost:0kB min:38424kB low:48028kB high:57632kB reserved_highatomic:2048KB active_anon:7476kB inactive_anon:89572kB active_file:128420kB inactive_file:2641056kB unevictable:0kB writepending:4kB present:5242880kB managed:3618860kB mlocked:0kB bounce:0kB free_pcp:16240kB local_pcp:4196kB free_cma:0kB > [173303.822880] lowmem_reserve[]: 0 0 0 0 0 > [173303.824683] Node 0 DMA: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 1*2048kB (M) 3*4096kB (M) = 14336kB > [173303.829410] Node 0 DMA32: 1385*4kB (UME) 1254*8kB (UME) 368*16kB (UME) 317*32kB (UME) 121*64kB (UM) 105*128kB (UME) 99*256kB (UM) 63*512kB (UME) 13*1024kB (UME) 0*2048kB 0*4096kB = 123700kB > [173303.836654] Node 0 Normal: 1056*4kB (UMEH) 1027*8kB (UMEH) 400*16kB (UMEH) 287*32kB (UMEH) 105*64kB (UMEH) 71*128kB (UM) 42*256kB (M) 10*512kB (M) 0*1024kB 0*2048kB 0*4096kB = 59704kB > [173303.843268] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB > [173303.846232] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB > [173303.849541] 1284815 total pagecache pages > [173303.851713] 7 pages in swap cache > [173303.853033] Swap cache stats: add 24, delete 17, find 0/1 > [173303.874135] Free swap = 8257760kB > [173303.875679] Total swap = 8258556kB > [173303.877210] 2097045 pages RAM > [173303.878574] 0 pages HighMem/MovableOnly > [173303.880204] 488083 pages reserved > [173303.881721] 0 pages cma reserved > [173303.883181] 0 pages hwpoisoned > [173311.514201] sysrq: Show State > [173311.531486] task:systemd state:S stack:23208 pid: 1 ppid: 0 flags:0x00000002 > [173311.535523] Call Trace: > [173311.536988] <TASK> > [173311.537992] __schedule+0x673/0x1540 > [173311.539782] ? io_schedule_timeout+0x160/0x160 > [173311.541842] schedule+0xe0/0x200 > [173311.543466] schedule_hrtimeout_range_clock+0x2c0/0x310 > [173311.545632] ? hrtimer_nanosleep_restart+0x170/0x170 > [173311.547948] ? lock_downgrade+0x130/0x130 > [173311.549585] ? lockdep_hardirqs_on_prepare.part.0+0x18c/0x370 > [173311.551686] ep_poll+0x7db/0xa80 > [173311.552999] ? ep_send_events+0x9f0/0x9f0 > [173311.555015] ? pick_next_task_stop+0x250/0x250 > [173311.574734] ? prepare_to_wait_exclusive+0x2c0/0x2c0 > [173311.576518] ? __lock_release+0x4c1/0xa00 > [173311.579162] do_epoll_wait+0x12f/0x160 > [173311.580973] __x64_sys_epoll_wait+0x12e/0x250 > [173311.582599] ? __x64_sys_epoll_pwait2+0x240/0x240 > [173311.584769] ? lockdep_hardirqs_on_prepare.part.0+0x18c/0x370 > [173311.587348] do_syscall_64+0x5c/0x90 > [173311.589335] ? do_syscall_64+0x69/0x90 > [173311.591128] ? lockdep_hardirqs_on+0x79/0x100 > [173311.593125] ? do_syscall_64+0x69/0x90 > [173311.594805] ? lockdep_hardirqs_on+0x79/0x100 > [173311.596598] entry_SYSCALL_64_after_hwframe+0x63/0xcd > [173311.598376] RIP: 0033:0x7f0a5834e7de > [173311.599836] RSP: 002b:00007ffcee99d180 EFLAGS: 00000293 ORIG_RAX: 00000000000000e8 > [173311.603432] RAX: ffffffffffffffda RBX: 000056269d7b5b80 RCX: 00007f0a5834e7de > [173311.605924] RDX: 00000000000000a6 RSI: 000056269da4f0d0 RDI: 0000000000000004 > [173311.608922] RBP: 000056269d7b5d10 R08: 0000000000000000 R09: 000000000000000d > [173311.611243] R10: 00000000ffffffff R11: 0000000000000293 R12: 0000000000000226 > [173311.613741] R13: 000056269d7b5b80 R14: 00000000000000a6 R15: 0000000000000037 > [173311.626581] </TASK> > [173311.627500] task:kthreadd state:S stack:27704 pid: 2 ppid: 0 flags:0x00004000 > [173311.630191] Call Trace: > [173311.631013] <TASK> > [173311.631872] __schedule+0x673/0x1540 > [173311.633147] ? io_schedule_timeout+0x160/0x160 > [173311.634725] schedule+0xe0/0x200 > [173311.635884] kthreadd+0x9fb/0xd60 > [173311.637204] ? kthread_is_per_cpu+0xc0/0xc0 > [173311.638684] ret_from_fork+0x22/0x30 > [173311.640320] </TASK> > [173311.641273] task:rcu_gp state:I stack:29336 pid: 3 ppid: 2 flags:0x00004000 > [173311.645006] Workqueue: 0x0 (rcu_gp) > [173311.646684] Call Trace: > [173311.647756] <TASK> > [173311.648895] __schedule+0x673/0x1540 > [173311.650435] ? io_schedule_timeout+0x160/0x160 > [173311.652317] schedule+0xe0/0x200 > [173311.653732] rescuer_thread+0x662/0xb80 > [173311.656494] ? _raw_spin_unlock_irqrestore+0x59/0x70 > [173311.658779] ? worker_thread+0xed0/0xed0 > [173311.660381] ? __kthread_parkme+0xcc/0x200 > [173311.662276] ? worker_thread+0xed0/0xed0 > [173311.664182] kthread+0x2a7/0x350 > [173311.665725] ? kthread_complete_and_exit+0x20/0x20 > [173311.667519] ret_from_fork+0x22/0x30 > [173311.669222] </TASK> > [173311.670098] task:rcu_par_gp state:I stack:30888 pid: 4 ppid: 2 flags:0x00004000 > [173311.673101] Call Trace: > [173311.673976] <TASK> > [173311.674868] __schedule+0x673/0x1540 > [173311.676197] ? io_schedule_timeout+0x160/0x160 > [173311.677805] schedule+0xe0/0x200 > [173311.678991] rescuer_thread+0x662/0xb80 > [173311.680502] ? _raw_spin_unlock_irqrestore+0x59/0x70 > [173311.682222] ? worker_thread+0xed0/0xed0 > ... > (too many lines) > ... > [173319.787263] Showing all locks held in the system: > [173319.788884] no locks held by systemd-journal/517. > [173319.789881] 1 lock held by fsstress/663798: > [173319.790895] #0: ffff888130b68488 (sb_writers#16){.+.+}-{0:0}, at: ksys_write+0xf9/0x1d0 > [173319.792551] 3 locks held by kworker/u8:2/666396: > [173319.794275] #0: ffff8881e4602f58 (&rq->__lock){-.-.}-{2:2}, at: raw_spin_rq_lock_nested+0x30/0x120 > [173319.795933] #1: ffff8881e4e02f58 (&rq->__lock){-.-.}-{2:2}, at: raw_spin_rq_lock_nested+0x30/0x120 > [173319.797605] #2: ffff8881009bc4f0 (&p->pi_lock){-.-.}-{2:2}, at: try_to_wake_up+0xb3/0xdd0 > [173319.799126] 3 locks held by 20_sysinfo/666628: > [173319.799995] #0: ffff88810c362488 (sb_writers#3){.+.+}-{0:0}, at: ksys_write+0xf9/0x1d0 > [173319.801695] #1: ffffffff860e6dc0 (rcu_read_lock){....}-{1:2}, at: __handle_sysrq+0x4a/0xe0 > [173319.803276] #2: ffffffff860e6dc0 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire.constprop.0+0x0/0x30 > [173319.804988] > [173319.805319] ============================================= > [173319.805319] > [173319.806614] Showing busy workqueues and worker pools: > [173319.807587] workqueue events_highpri: flags=0x10 > [173319.808538] pwq 5: cpus=2 node=0 flags=0x0 nice=-20 active=1/256 refcnt=2 > [173319.808581] pending: mix_interrupt_randomness > [173328.255900] sysrq: Show Blocked State > [173328.276745] task:fsstress state:D stack:23416 pid:663798 ppid:663797 flags:0x00004002 > [173328.279713] Call Trace: > [173328.280646] <TASK> > [173328.281768] __schedule+0x673/0x1540 > [173328.283583] ? io_schedule_timeout+0x160/0x160 > [173328.285898] ? pick_next_task_stop+0x250/0x250 > [173328.288286] schedule+0xe0/0x200 > [173328.289932] schedule_timeout+0x19e/0x250 > [173328.291845] ? usleep_range_state+0x190/0x190 > [173328.293563] ? mark_held_locks+0xa5/0xf0 > [173328.295072] ? lockdep_hardirqs_on_prepare.part.0+0x18c/0x370 > [173328.297228] ? _raw_spin_unlock_irq+0x24/0x50 > [173328.299148] __wait_for_common+0x381/0x540 > [173328.301081] ? usleep_range_state+0x190/0x190 > [173328.303037] ? out_of_line_wait_on_bit_timeout+0x170/0x170 > [173328.304937] ? up_write+0x20/0x20 > [173328.306118] ? nfs_file_direct_write+0xaef/0x1050 [nfs] > [173328.308327] wait_for_completion_killable+0x20/0x40 > [173328.310146] nfs_file_direct_write+0xb2c/0x1050 [nfs] > [173328.312009] ? nfs_file_direct_read+0xaa0/0xaa0 [nfs] > [173328.313878] ? nfs_key_timeout_notify+0x3a/0x90 [nfs] > [173328.315651] ? __lock_acquire+0xb72/0x1870 > [173328.317032] ? nfs_file_write+0xed/0x8c0 [nfs] > [173328.318716] new_sync_write+0x2ef/0x540 > [173328.320059] ? new_sync_read+0x530/0x530 > [173328.321447] ? lock_acquire+0x1d8/0x620 > [173328.322839] ? rcu_read_unlock+0x40/0x40 > [173328.324309] vfs_write+0x62a/0x920 > [173328.325592] ksys_write+0xf9/0x1d0 > [173328.327621] ? __ia32_sys_read+0xa0/0xa0 > [173328.328996] ? ktime_get_coarse_real_ts64+0x130/0x170 > [173328.330746] do_syscall_64+0x5c/0x90 > [173328.331998] ? do_syscall_64+0x69/0x90 > [173328.333303] ? lockdep_hardirqs_on+0x79/0x100 > [173328.335143] ? do_syscall_64+0x69/0x90 > [173328.336483] ? do_syscall_64+0x69/0x90 > [173328.338093] ? lockdep_hardirqs_on+0x79/0x100 > [173328.340276] ? do_syscall_64+0x69/0x90 > [173328.341683] ? asm_sysvec_apic_timer_interrupt+0x16/0x20 > [173328.343466] ? lockdep_hardirqs_on+0x79/0x100 > [173328.344989] entry_SYSCALL_64_after_hwframe+0x63/0xcd > [173328.346646] RIP: 0033:0x7f16be13e8c7 > [173328.347857] RSP: 002b:00007ffea1b02798 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 > [173328.350228] RAX: ffffffffffffffda RBX: 0000000000100000 RCX: 00007f16be13e8c7 > [173328.352485] RDX: 0000000000100000 RSI: 0000000001c00000 RDI: 0000000000000003 > [173328.354769] RBP: 0000000000000003 R08: 0000000001d00000 R09: 0000000000000000 > [173328.357318] R10: 0000000000000100 R11: 0000000000000246 R12: 0000000000017de1 > [173328.359664] R13: 0000000000000000 R14: 0000000001c00000 R15: 0000000000000000 > [173328.362074] </TASK> > Stopping > /usr/bin/bash -c …j; exec ./tests/generic/476 > ... > > [1] > [ 9746.410677] run fstests generic/476 at 2022-07-16 15:42:38 > [ 9803.017543] restraintd[2193]: *** Current Time: Sat Jul 16 15:43:33 2022 Localwatchdog at: Mon Jul 18 13:01:33 2022 > [ 9863.016020] restraintd[2193]: *** Current Time: Sat Jul 16 15:44:33 2022 Localwatchdog at: Mon Jul 18 13:01:33 2022 > [-- MARK -- Sat Jul 16 19:45:00 2022] > [ 9923.014409] restraintd[2193]: *** Current Time: Sat Jul 16 15:45:33 2022 Localwatchdog at: Mon Jul 18 13:01:33 2022 > [ 9983.016088] restraintd[2193]: *** Current Time: Sat Jul 16 15:46:33 2022 Localwatchdog at: Mon Jul 18 13:01:33 2022 > [10043.014435] restraintd[2193]: *** Current Time: Sat Jul 16 15:47:33 2022 Localwatchdog at: Mon Jul 18 13:01:33 2022 > [10103.017105] restraintd[2193]: *** Current Time: Sat Jul 16 15:48:33 2022 Localwatchdog at: Mon Jul 18 13:01:33 2022 > [10163.057993] restraintd[2193]: *** Current Time: Sat Jul 16 15:49:33 2022 Localwatchdog at: Mon Jul 18 13:01:33 2022 > [-- MARK -- Sat Jul 16 19:50:00 2022] > [10223.018212] restraintd[2193]: *** Current Time: Sat Jul 16 15:50:33 2022 Localwatchdog at: Mon Jul 18 13:01:33 2022 > [10283.015296] restraintd[2193]: *** Current Time: Sat Jul 16 15:51:33 2022 Localwatchdog at: Mon Jul 18 13:01:33 2022 > [10343.014956] restraintd[2193]: *** Current Time: Sat Jul 16 15:52:33 2022 Localwatchdog at: Mon Jul 18 13:01:33 2022 > [10403.014340] restraintd[2193]: *** Current Time: Sat Jul 16 15:53:33 2022 Localwatchdog at: Mon Jul 18 13:01:33 2022 > [10463.016180] restraintd[2193]: *** Current Time: Sat Jul 16 15:54:33 2022 Localwatchdog at: Mon Jul 18 13:01:33 2022 > [-- MARK -- Sat Jul 16 19:55:00 2022] > [10523.017423] restraintd[2193]: *** Current Time: Sat Jul 16 15:55:33 2022 Localwatchdog at: Mon Jul 18 13:01:33 2022 > [10583.015557] restraintd[2193]: *** Current Time: Sat Jul 16 15:56:33 2022 Localwatchdog at: Mon Jul 18 13:01:33 2022 > [10643.014811] restraintd[2193]: *** Current Time: Sat Jul 16 15:57:33 2022 Localwatchdog at: Mon Jul 18 13:01:33 2022 > [10703.009075] restraintd[2193]: *** Current Time: Sat Jul 16 15:58:33 2022 Localwatchdog at: Mon Jul 18 13:01:33 2022 > [10752.278563] run fstests generic/477 at 2022-07-16 15:59:23 > ... > > [2] > The 1st layer fs: xfs - SCRATCH_DEV=/dev/loop1, SCRATCH_MNT=/mnt/fstests/SCRATCH_DIR > meta-data=/dev/loop1 isize=512 agcount=4, agsize=655360 blks > = sectsz=512 attr=2, projid32bit=1 > = crc=1 finobt=1, sparse=1, rmapbt=0 > = reflink=1 bigtime=1 inobtcount=1 nrext64=0 > data = bsize=4096 blocks=2621440, imaxpct=25 > = sunit=0 swidth=0 blks > naming =version 2 bsize=4096 ascii-ci=0, ftype=1 > log =internal log bsize=4096 blocks=16384, version=2 > = sectsz=512 sunit=0 blks, lazy-count=1 > realtime =none extsz=4096 blocks=0, rtextents=0 > Discarding blocks...Done. > The 2nd layer fs: nfs - SCRATCH_DEV=s390x-kvm-xxx.xxx.xxx.xxxx.redhat.com:/mnt/fstests/SCRATCH_DIR/nfs-server, SCRATCH_MNT=/mnt/fstests/SCRATCH_DIR/nfs-client > > > > Thanks Zorro. I'm having a look at this now: https://bugzilla.redhat.com/show_bug.cgi?id=2028370 I'm pretty sure this is a client error handling bug of some sort. It looks like the server is doing the right thing, but the client keeps trying to reissue the same write over and over. -- Jeff Layton <jlayton@xxxxxxxxxx>