I never tried, but I removed async and noatime and started a test run on the VM now. The result will take some time, as written below. > Gesendet: Sonntag, den 24.03.2024 um 23:13 Uhr > Von: "Chuck Lever III" <chuck.lever@xxxxxxxxxx> > An: "Jan Schunk" <scpcom@xxxxxx> > Cc: "Jeff Layton" <jlayton@xxxxxxxxxx>, "Neil Brown" <neilb@xxxxxxx>, "Olga Kornievskaia" <kolga@xxxxxxxxxx>, "Dai Ngo" <dai.ngo@xxxxxxxxxx>, "Tom Talpey" <tom@xxxxxxxxxx>, "Linux NFS Mailing List" <linux-nfs@xxxxxxxxxxxxxxx>, "linux-kernel@xxxxxxxxxxxxxxx" <linux-kernel@xxxxxxxxxxxxxxx> > Betreff: Re: [External] : nfsd: memory leak when client does many file operations > > > > > On Mar 24, 2024, at 5:39 PM, Jan Schunk <scpcom@xxxxxx> wrote: > > > > Yes, the VM is x86_64. > > > > "pgrep -c nfsd" says: 9 > > > > I use NFS version 3. > > > > All network ports are connected with 1GBit/s. > > > > The exported file system is ext4. > > > > I do not use any authentication. > > > > The mount options in /etc/fstab are: > > rw,noatime,nfsvers=3,proto=tcp,hard,nointr,timeo=600,rsize=32768,wsize=32768,noauto > > > > The line in /etc/exports: > > /export/data3 192.168.0.0/16(fsid=<uuid>,rw,no_root_squash,async,no_subtree_check) > > Is it possible to reproduce this issue without the "noatime" > mount option and without the "async" export option? > > > >> Gesendet: Sonntag, den 24.03.2024 um 22:10 Uhr > >> Von: "Chuck Lever III" <chuck.lever@xxxxxxxxxx> > >> An: "Jan Schunk" <scpcom@xxxxxx> > >> Cc: "Jeff Layton" <jlayton@xxxxxxxxxx>, "Neil Brown" <neilb@xxxxxxx>, "Olga Kornievskaia" <kolga@xxxxxxxxxx>, "Dai Ngo" <dai.ngo@xxxxxxxxxx>, "Tom Talpey" <tom@xxxxxxxxxx>, "Linux NFS Mailing List" <linux-nfs@xxxxxxxxxxxxxxx>, "linux-kernel@xxxxxxxxxxxxxxx" <linux-kernel@xxxxxxxxxxxxxxx> > >> Betreff: Re: [External] : nfsd: memory leak when client does many file operations > >> > >> > >>> On Mar 24, 2024, at 4:48 PM, Jan Schunk <scpcom@xxxxxx> wrote: > >>> > >>> The "heavy usage" is a simple script runinng on the client and does the following: > >>> 1. Create a empty git repository on the share > >>> 2. Unpacking a tar.gz archive (Qnap GPL source code) > >>> 3. Remove some folders/files > >>> 4. Use diff to compare it with an older version > >>> 5. commit them to the git > >>> 6. Repeat at step 2 with next archive > >>> > >>> On my armhf NAS the other memory consuming workload is an SMB server. > >> > >> I'm not sure any of us has a Freescale system to try this ... > >> > >> > >>> On the test VM the other memory consuming workload is a GNOME desktop. > >> > >> ... and so I'm hoping this VM is an x86_64 system. > >> > >> > >>> But it does not make much difference if I stop other services it just takes a bit longer until the same issue happens. > >>> The size of swap also does not make a difference. > >> > >> What is the nfsd thread count on the server? 'pgrep -c nfsd' > >> > >> What version of NFS does your client mount with? > >> > >> What is the speed of the network between your client and server? > >> > >> What is the type of the exported file system? > >> > >> Do you use NFS with Kerberos? > >> > >> > >>>> Gesendet: Sonntag, den 24.03.2024 um 21:14 Uhr > >>>> Von: "Chuck Lever III" <chuck.lever@xxxxxxxxxx> > >>>> An: "Jan Schunk" <scpcom@xxxxxx> > >>>> Cc: "Jeff Layton" <jlayton@xxxxxxxxxx>, "Neil Brown" <neilb@xxxxxxx>, "Olga Kornievskaia" <kolga@xxxxxxxxxx>, "Dai Ngo" <dai.ngo@xxxxxxxxxx>, "Tom Talpey" <tom@xxxxxxxxxx>, "Linux NFS Mailing List" <linux-nfs@xxxxxxxxxxxxxxx>, "linux-kernel@xxxxxxxxxxxxxxx" <linux-kernel@xxxxxxxxxxxxxxx> > >>>> Betreff: Re: [External] : nfsd: memory leak when client does many file operations > >>>> > >>>> > >>>> > >>>>> On Mar 24, 2024, at 3:57 PM, Jan Schunk <scpcom@xxxxxx> wrote: > >>>>> > >>>>> Issue found on: v6.5.13 v6.6.13, v6.6.14, v6.6.20 and v6.8.1 > >>>>> Not found on: v6.4, v6.1.82 and below > >>>>> Architectures: amd64 and arm(hf) > >>>>> > >>>>> Steps to reproduce: > >>>>> - Create a VM with 1GB RAM > >>>>> - Install Debian 12 > >>>>> - Install linux-image-6.6.13+bpo-amd64-unsigned and nfs-kernel-server > >>>>> - Export some folder > >>>>> On the client: > >>>>> - Mount the share > >>>>> - Run a script that does produce heavy usage on the share (like unpacking large tar archives that cointain many small files into a git and commiting them) > >>>> > >>>> Hi Jan, thanks for the report. > >>>> > >>>> The "produce heavy usage" instruction here is pretty vague. > >>>> I run CI testing with kmemleak enabled, and have not seen > >>>> any leaks on recent kernels when running the git regression > >>>> tests, which are similar to this kind of workload. > >>>> > >>>> Can you try to narrow the reproducer for us, even just a > >>>> little? What client action exactly is triggering the memory > >>>> leak? Is there any other workload on your NFS server that > >>>> might be consuming memory? > >>>> > >>>> > >>>>> On my setup it takes 20-40 hours until the memory is full and oom-kill gets hired by nfsd to kill other processes. the memory stays full and the system reboots: > >>>>> > >>>>> [121969.590000] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,task=dbus-daemon,pid=454,uid=101 > >>>>> [121969.600000] Out of memory: Killed process 454 (dbus-daemon) total-vm:6196kB, anon-rss:128kB, file-rss:1408kB, shmem-rss:0kB, UID:101 pgtables:12kB oom_score_adj:-900 > >>>>> [121971.700000] oom_reaper: reaped process 454 (dbus-daemon), now anon-rss:0kB, file-rss:64kB, shmem-rss:0kB > >>>>> [121971.920000] nfsd invoked oom-killer: gfp_mask=0xcc0(GFP_KERNEL), order=0, oom_score_adj=0 > >>>>> [121971.930000] CPU: 1 PID: 537 Comm: nfsd Not tainted 6.8.1+nas5xx #nas5xx > >>>>> [121971.930000] Hardware name: Freescale LS1024A > >>>>> [121971.940000] unwind_backtrace from show_stack+0xb/0xc > >>>>> [121971.940000] show_stack from dump_stack_lvl+0x2b/0x34 > >>>>> [121971.950000] dump_stack_lvl from dump_header+0x35/0x212 > >>>>> [121971.950000] dump_header from out_of_memory+0x317/0x34c > >>>>> [121971.960000] out_of_memory from __alloc_pages+0x8e7/0xbb0 > >>>>> [121971.970000] __alloc_pages from __alloc_pages_bulk+0x26d/0x3d8 > >>>>> [121971.970000] __alloc_pages_bulk from svc_recv+0x9d/0x7d4 > >>>>> [121971.980000] svc_recv from nfsd+0x7d/0xd4 > >>>>> [121971.980000] nfsd from kthread+0xb9/0xcc > >>>>> [121971.990000] kthread from ret_from_fork+0x11/0x1c > >>>>> [121971.990000] Exception stack(0xc2cadfb0 to 0xc2cadff8) > >>>>> [121971.990000] dfa0: 00000000 00000000 00000000 00000000 > >>>>> [121972.000000] dfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 > >>>>> [121972.010000] dfe0: 00000000 00000000 00000000 00000000 00000013 00000000 > >>>>> [121972.020000] Mem-Info: > >>>>> [121972.020000] active_anon:101 inactive_anon:127 isolated_anon:29 > >>>>> [121972.020000] active_file:1200 inactive_file:1204 isolated_file:98 > >>>>> [121972.020000] unevictable:394 dirty:296 writeback:17 > >>>>> [121972.020000] slab_reclaimable:13680 slab_unreclaimable:4350 > >>>>> [121972.020000] mapped:637 shmem:4 pagetables:414 > >>>>> [121972.020000] sec_pagetables:0 bounce:0 > >>>>> [121972.020000] kernel_misc_reclaimable:0 > >>>>> [121972.020000] free:7279 free_pcp:184 free_cma:1094 > >>>>> [121972.060000] Node 0 active_anon:404kB inactive_anon:508kB active_file:4736kB inactive_file:4884kB unevictable:1576kB isolated(anon):116kB isolated(file):388kB mapped:2548kB dirty:1184kB writeback:68kB shmem:16kB writeback_tmp:0kB kernel_stack:1088kB pagetables:1656kB sec_pagetables:0kB all_unreclaimable? no > >>>>> [121972.090000] Normal free:29116kB boost:18432kB min:26624kB low:28672kB high:30720kB reserved_highatomic:0KB active_anon:404kB inactive_anon:712kB active_file:4788kB inactive_file:4752kB unevictable:1576kB writepending:1252kB present:1048576kB managed:1011988kB mlocked:1576kB bounce:0kB free_pcp:736kB local_pcp:236kB free_cma:4376kB > >>>>> [121972.120000] lowmem_reserve[]: 0 0 > >>>>> [121972.120000] Normal: 2137*4kB (UEC) 1173*8kB (UEC) 529*16kB (UEC) 19*32kB (UC) 7*64kB (C) 5*128kB (C) 2*256kB (C) 1*512kB (C) 0*1024kB 0*2048kB 0*4096kB = 29116kB > >>>>> [121972.140000] 2991 total pagecache pages > >>>>> [121972.140000] 166 pages in swap cache > >>>>> [121972.140000] Free swap = 93424kB > >>>>> [121972.150000] Total swap = 102396kB > >>>>> [121972.150000] 262144 pages RAM > >>>>> [121972.150000] 0 pages HighMem/MovableOnly > >>>>> [121972.160000] 9147 pages reserved > >>>>> [121972.160000] 4096 pages cma reserved > >>>>> [121972.160000] Unreclaimable slab info: > >>>>> [121972.170000] Name Used Total > >>>>> [121972.170000] bio-88 64KB 64KB > >>>>> [121972.180000] TCPv6 61KB 61KB > >>>>> [121972.180000] bio-76 16KB 16KB > >>>>> [121972.190000] bio-188 11KB 11KB > >>>>> [121972.190000] nfs_read_data 22KB 22KB > >>>>> [121972.200000] kioctx 15KB 15KB > >>>>> [121972.200000] posix_timers_cache 7KB 7KB > >>>>> [121972.210000] UDP 63KB 63KB > >>>>> [121972.220000] tw_sock_TCP 3KB 3KB > >>>>> [121972.220000] request_sock_TCP 3KB 3KB > >>>>> [121972.230000] TCP 62KB 62KB > >>>>> [121972.230000] bio-168 7KB 7KB > >>>>> [121972.240000] ep_head 8KB 8KB > >>>>> [121972.240000] request_queue 15KB 15KB > >>>>> [121972.250000] bio-124 18KB 40KB > >>>>> [121972.250000] biovec-max 264KB 264KB > >>>>> [121972.260000] biovec-128 63KB 63KB > >>>>> [121972.260000] biovec-64 157KB 157KB > >>>>> [121972.270000] skbuff_small_head 94KB 94KB > >>>>> [121972.270000] skbuff_fclone_cache 55KB 63KB > >>>>> [121972.280000] skbuff_head_cache 59KB 59KB > >>>>> [121972.280000] fsnotify_mark_connector 16KB 28KB > >>>>> [121972.290000] sigqueue 19KB 31KB > >>>>> [121972.300000] shmem_inode_cache 1622KB 1662KB > >>>>> [121972.300000] kernfs_iattrs_cache 15KB 15KB > >>>>> [121972.310000] kernfs_node_cache 2107KB 2138KB > >>>>> [121972.310000] filp 259KB 315KB > >>>>> [121972.320000] net_namespace 30KB 30KB > >>>>> [121972.320000] uts_namespace 15KB 15KB > >>>>> [121972.330000] vma_lock 143KB 179KB > >>>>> [121972.330000] vm_area_struct 459KB 553KB > >>>>> [121972.340000] sighand_cache 191KB 220KB > >>>>> [121972.340000] task_struct 378KB 446KB > >>>>> [121972.350000] anon_vma_chain 753KB 804KB > >>>>> [121972.360000] anon_vma 170KB 207KB > >>>>> [121972.360000] trace_event_file 83KB 83KB > >>>>> [121972.370000] mm_struct 157KB 173KB > >>>>> [121972.370000] vmap_area 217KB 354KB > >>>>> [121972.380000] kmalloc-8k 224KB 224KB > >>>>> [121972.380000] kmalloc-4k 860KB 992KB > >>>>> [121972.390000] kmalloc-2k 352KB 352KB > >>>>> [121972.390000] kmalloc-1k 563KB 576KB > >>>>> [121972.400000] kmalloc-512 936KB 936KB > >>>>> [121972.400000] kmalloc-256 196KB 240KB > >>>>> [121972.410000] kmalloc-192 160KB 169KB > >>>>> [121972.410000] kmalloc-128 546KB 764KB > >>>>> [121972.420000] kmalloc-64 1213KB 1288KB > >>>>> [121972.420000] kmem_cache_node 12KB 12KB > >>>>> [121972.430000] kmem_cache 16KB 16KB > >>>>> [121972.440000] Tasks state (memory values in pages): > >>>>> [121972.440000] [ pid ] uid tgid total_vm rss rss_anon rss_file rss_shmem pgtables_bytes swapents oom_score_adj name > >>>>> [121972.450000] [ 209] 0 209 5140 320 0 320 0 16384 480 -1000 systemd-udevd > >>>>> [121972.460000] [ 230] 998 230 2887 55 32 23 0 18432 0 0 systemd-network > >>>>> [121972.470000] [ 420] 0 420 596 0 0 0 0 6144 22 0 mdadm > >>>>> [121972.490000] [ 421] 102 421 1393 56 32 24 0 10240 0 0 rpcbind > >>>>> [121972.500000] [ 429] 996 429 3695 17 0 17 0 20480 0 0 systemd-resolve > >>>>> [121972.510000] [ 433] 0 433 494 51 0 51 0 8192 0 0 rpc.idmapd > >>>>> [121972.520000] [ 434] 0 434 743 92 33 59 0 8192 7 0 nfsdcld > >>>>> [121972.530000] [ 451] 0 451 390 0 0 0 0 6144 0 0 acpid > >>>>> [121972.540000] [ 453] 105 453 1380 50 32 18 0 10240 18 0 avahi-daemon > >>>>> [121972.550000] [ 454] 101 454 1549 16 0 16 0 12288 32 -900 dbus-daemon > >>>>> [121972.560000] [ 466] 0 466 3771 60 0 60 0 14336 0 0 irqbalance > >>>>> [121972.570000] [ 475] 0 475 6269 32 32 0 0 18432 0 0 rsyslogd > >>>>> [121972.590000] [ 487] 105 487 1347 68 38 30 0 10240 0 0 avahi-daemon > >>>>> [121972.600000] [ 492] 0 492 1765 0 0 0 0 12288 0 0 cron > >>>>> [121972.610000] [ 493] 0 493 2593 0 0 0 0 16384 0 0 wpa_supplicant > >>>>> [121972.620000] [ 494] 0 494 607 0 0 0 0 8192 32 0 atd > >>>>> [121972.630000] [ 506] 0 506 1065 25 0 25 0 10240 0 0 rpc.mountd > >>>>> [121972.640000] [ 514] 103 514 809 25 0 25 0 8192 0 0 rpc.statd > >>>>> [121972.650000] [ 522] 0 522 999 31 0 31 0 10240 0 0 agetty > >>>>> [121972.660000] [ 524] 0 524 1540 28 0 28 0 12288 0 0 agetty > >>>>> [121972.670000] [ 525] 0 525 9098 56 32 24 0 34816 0 0 unattended-upgr > >>>>> [121972.690000] [ 526] 0 526 2621 320 0 320 0 14336 192 -1000 sshd > >>>>> [121972.700000] [ 539] 0 539 849 32 32 0 0 8192 0 0 in.tftpd > >>>>> [121972.710000] [ 544] 113 544 4361 6 6 0 0 16384 25 0 chronyd > >>>>> [121972.720000] [ 546] 0 546 16816 62 32 30 0 45056 0 0 winbindd > >>>>> [121972.730000] [ 552] 0 552 16905 59 32 27 0 45056 3 0 winbindd > >>>>> [121972.740000] [ 559] 0 559 17849 94 32 30 32 49152 4 0 smbd > >>>>> [121972.750000] [ 572] 0 572 17409 40 16 24 0 43008 11 0 smbd-notifyd > >>>>> [121972.760000] [ 573] 0 573 17412 16 16 0 0 43008 24 0 cleanupd > >>>>> [121972.770000] [ 584] 0 584 3036 20 0 20 0 16384 4 0 sshd > >>>>> [121972.780000] [ 589] 0 589 16816 32 2 30 0 40960 21 0 winbindd > >>>>> [121972.790000] [ 590] 0 590 27009 47 23 24 0 65536 21 0 smbd > >>>>> [121972.810000] [ 597] 501 597 3344 91 32 59 0 20480 0 100 systemd > >>>>> [121972.820000] [ 653] 501 653 3036 0 0 0 0 16384 33 0 sshd > >>>>> [121972.830000] [ 656] 501 656 1938 93 32 61 0 12288 9 0 bash > >>>>> [121972.840000] [ 704] 0 704 395 352 64 288 0 6144 0 -1000 watchdog > >>>>> [121972.850000] [ 738] 501 738 2834 12 0 12 0 16384 6 0 top > >>>>> [121972.860000] [ 4750] 0 4750 4218 44 26 18 0 18432 11 0 proftpd > >>>>> [121972.870000] [ 4768] 0 4768 401 31 0 31 0 6144 0 0 apt.systemd.dai > >>>>> [121972.880000] [ 4772] 0 4772 401 31 0 31 0 6144 0 0 apt.systemd.dai > >>>>> [121972.890000] [ 4778] 0 4778 13556 54 0 54 0 59392 26 0 apt-get > >>>>> [121972.900000] Out of memory and no killable processes... > >>>>> [121972.910000] Kernel panic - not syncing: System is deadlocked on memory > >>>>> [121972.920000] CPU: 1 PID: 537 Comm: nfsd Not tainted 6.8.1+nas5xx #nas5xx > >>>>> [121972.920000] Hardware name: Freescale LS1024A > >>>>> [121972.930000] unwind_backtrace from show_stack+0xb/0xc > >>>>> [121972.930000] show_stack from dump_stack_lvl+0x2b/0x34 > >>>>> [121972.940000] dump_stack_lvl from panic+0xbf/0x264 > >>>>> [121972.940000] panic from out_of_memory+0x33f/0x34c > >>>>> [121972.950000] out_of_memory from __alloc_pages+0x8e7/0xbb0 > >>>>> [121972.950000] __alloc_pages from __alloc_pages_bulk+0x26d/0x3d8 > >>>>> [121972.960000] __alloc_pages_bulk from svc_recv+0x9d/0x7d4 > >>>>> [121972.960000] svc_recv from nfsd+0x7d/0xd4 > >>>>> [121972.970000] nfsd from kthread+0xb9/0xcc > >>>>> [121972.970000] kthread from ret_from_fork+0x11/0x1c > >>>>> [121972.980000] Exception stack(0xc2cadfb0 to 0xc2cadff8) > >>>>> [121972.980000] dfa0: 00000000 00000000 00000000 00000000 > >>>>> [121972.990000] dfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 > >>>>> [121973.000000] dfe0: 00000000 00000000 00000000 00000000 00000013 00000000 > >>>>> [121973.010000] CPU0: stopping > >>>>> [121973.010000] CPU: 0 PID: 540 Comm: nfsd Not tainted 6.8.1+nas5xx #nas5xx > >>>>> [121973.010000] Hardware name: Freescale LS1024A > >>>>> [121973.010000] unwind_backtrace from show_stack+0xb/0xc > >>>>> [121973.010000] show_stack from dump_stack_lvl+0x2b/0x34 > >>>>> [121973.010000] dump_stack_lvl from do_handle_IPI+0x151/0x178 > >>>>> [121973.010000] do_handle_IPI from ipi_handler+0x13/0x18 > >>>>> [121973.010000] ipi_handler from handle_percpu_devid_irq+0x55/0x144 > >>>>> [121973.010000] handle_percpu_devid_irq from generic_handle_domain_irq+0x17/0x20 > >>>>> [121973.010000] generic_handle_domain_irq from gic_handle_irq+0x5f/0x70 > >>>>> [121973.010000] gic_handle_irq from generic_handle_arch_irq+0x27/0x34 > >>>>> [121973.010000] generic_handle_arch_irq from call_with_stack+0xd/0x10 > >>>>> [121973.010000] Rebooting in 90 seconds.. > >>>> > >>>> -- > >>>> Chuck Lever > >>>> > >>>> > >> > >> -- > >> Chuck Lever > >> > >> > > -- > Chuck Lever > >