On an older system I've been running Gentoo's revdep-rebuild to check for system linking/*.la consistency and after doing most of the work the system starved more or less, just complaining about stuck tasks now and then. Memory usage graph as seen from userspace showed sudden quick increase of memory usage though only a very few MB were swapped out (c.f. attached RRD graph). Sample soft-lockup trace: [ 3683.000651] BUG: soft lockup - CPU#0 stuck for 64s! [kswapd0:365] [ 3683.000720] Modules linked in: squashfs zlib_inflate nfs lockd nfs_acl sunrpc snd_intel8x0 snd_ac97_codec ac97_bus snd_pcm snd_timer pcspkr snd snd_page_alloc [ 3683.001331] Modules linked in: squashfs zlib_inflate nfs lockd nfs_acl sunrpc snd_intel8x0 snd_ac97_codec ac97_bus snd_pcm snd_timer pcspkr snd snd_page_alloc [ 3683.001941] [ 3683.001988] Pid: 365, comm: kswapd0 Tainted: G W 2.6.39-rc4-jupiter-00187-g686c4cb #1 NVIDIA Corporation. nFORCE-MCP/MS-6373 [ 3683.002163] EIP: 0060:[<c1092cd0>] EFLAGS: 00000246 CPU: 0 [ 3683.002221] EIP is at shrink_inactive_list+0x110/0x340 [ 3683.002272] EAX: 00000000 EBX: c1546620 ECX: 00000001 EDX: 00000000 [ 3683.002324] ESI: dd587f78 EDI: dd587e68 EBP: dd587e88 ESP: dd587e38 [ 3683.002376] DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068 [ 3683.002427] Process kswapd0 (pid: 365, ti=dd586000 task=dd532700 task.ti=dd586000) [ 3683.002486] Stack: [ 3683.002532] dd587e78 00000001 00000000 00000001 00000054 dd532700 c15468d8 c154688c [ 3683.002871] c15468dc 00000000 dd587e5c 00000000 dd587e68 dd587e68 dd587e6c 0000007b [ 3683.003212] 00000000 00000002 00000000 00000000 dd587f14 c10932a0 00000002 00000001 [ 3683.003548] Call Trace: [ 3683.003602] [<c10932a0>] shrink_zone+0x3a0/0x500 [ 3683.003653] [<c1093b4e>] kswapd+0x74e/0x8d0 [ 3683.003705] [<c1093400>] ? shrink_zone+0x500/0x500 [ 3683.003757] [<c1047794>] kthread+0x74/0x80 [ 3683.003807] [<c1047720>] ? kthreadd+0xb0/0xb0 [ 3683.003864] [<c13918b6>] kernel_thread_helper+0x6/0xd [ 3683.003913] Code: 0e 04 74 5f 89 da 2b 93 e0 02 00 00 c1 fa 02 69 d2 95 fa 53 ea 01 04 95 e8 55 52 c1 8b 55 d4 85 d2 75 5f fb c7 45 dc 00 00 00 00 <8b> 45 dc 83 c4 44 5b 5e 5f c9 c3 90 8d 74 26 00 83 7d 08 09 0f [ 3683.006264] Call Trace: [ 3683.006312] [<c10932a0>] shrink_zone+0x3a0/0x500 [ 3683.006379] [<c1093b4e>] kswapd+0x74e/0x8d0 [ 3683.006430] [<c1093400>] ? shrink_zone+0x500/0x500 [ 3683.006480] [<c1047794>] kthread+0x74/0x80 [ 3683.006529] [<c1047720>] ? kthreadd+0xb0/0xb0 [ 3683.006580] [<c13918b6>] kernel_thread_helper+0x6/0xd A meminfo (sysreq + m) in between the traces produced the following weird looking information: [ 6605.162375] Mem-Info: [ 6605.162421] DMA per-cpu: [ 6605.162468] CPU 0: hi: 0, btch: 1 usd: 0 [ 6605.162517] Normal per-cpu: [ 6605.162565] CPU 0: hi: 186, btch: 31 usd: 56 [ 6605.162618] active_anon:56 inactive_anon:86 isolated_anon:32 [ 6605.162620] active_file:1114 inactive_file:1101 isolated_file:32 [ 6605.162623] unevictable:8 dirty:0 writeback:0 unstable:0 [ 6605.162624] free:1195 slab_reclaimable:14698 slab_unreclaimable:34339 [ 6605.162626] mapped:135 shmem:0 pagetables:87 bounce:0 [ 6605.162875] DMA free:1936kB min:88kB low:108kB high:132kB active_anon:4kB inactive_anon:4294967292kB active_file:4294967284kB inac tive_file:4294967252kB unevictable:0kB isolated(anon):4kB isolated(file):56kB present:15808kB mlocked:0kB dirty:0kB writeback:0kB map ped:0kB shmem:0kB slab_reclaimable:180kB slab_unreclaimable:4688kB kernel_stack:9104kB pagetables:0kB unstable:0kB bounce:0kB writeba ck_tmp:0kB pages_scanned:2 all_unreclaimable? no [ 6605.163011] lowmem_reserve[]: 0 460 460 [ 6605.163207] Normal free:2836kB min:2696kB low:3368kB high:4044kB active_anon:220kB inactive_anon:348kB active_file:4468kB inactive _file:4448kB unevictable:32kB isolated(anon):124kB isolated(file):72kB present:471360kB mlocked:32kB dirty:0kB writeback:0kB mapped:5 40kB shmem:0kB slab_reclaimable:58612kB slab_unreclaimable:132676kB kernel_stack:256640kB pagetables:348kB unstable:0kB bounce:0kB wr iteback_tmp:0kB pages_scanned:871 all_unreclaimable? no [ 6605.163337] lowmem_reserve[]: 0 0 0 [ 6605.163526] DMA: 96*4kB 178*8kB 8*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 1936kB [ 6605.164012] Normal: 539*4kB 1*8kB 0*16kB 1*32kB 0*64kB 1*128kB 0*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 2836kB [ 6605.164500] 2360 total pagecache pages [ 6605.164547] 113 pages in swap cache [ 6605.164595] Swap cache stats: add 135546, delete 135433, find 121976/158215 [ 6605.164647] Free swap = 518472kB [ 6605.164694] Total swap = 524284kB [ 6605.170001] 122848 pages RAM [ 6605.170001] 2699 pages reserved [ 6605.170001] 2129 pages shared [ 6605.170001] 83744 pages non-shared dmesg output (stripped part of the soft-lockup traces to keep log small) and config attached. That state suddenly "fixed" itself when revdep-rebuild ended up in STOPPED state (lots of minutes after hitting CTRL+Z on its terminal), though system didn't survive very long after that (died in network stack handling RX IRQ for IPv6 packet. System has mostly reiserfs filesystems mounted (one ext3 but that one should not have been touched). Previously I've been running kernels up to 2.6.38 without seeing this kind of issue. This was first run of 2.6.39-rc kernel on this system. If there is some more info I can provide, please tell! Bruno
Attachment:
vm-madness.config
Description: Binary data
Attachment:
vm-madness.dmesg
Description: Binary data
Attachment:
memory-exhausted.png
Description: PNG image