On Mon, Aug 19, 2019 at 9:53 AM Markus Linnala <markus.linnala@xxxxxxxxx> wrote:
> I've started to test 5.3-rc5 and generally there are about the same
> issues as with 5.3-rc4. I'll start testing with your patch right away.
I do not expect any change in behavior with rc5. We can aim for rc6 though.
~Vitaly
On Mon, Aug 19, 2019 at 18:27, Vitaly Wool (vitalywool@xxxxxxxxx) wrote:
>
> On Mon, Aug 19, 2019 at 4:42 PM Vitaly Wool <vitalywool@xxxxxxxxx> wrote:
> >
> > Hey Michal,
> >
> > On Mon, Aug 19, 2019 at 9:35 AM Michal Hocko <mhocko@xxxxxxxxxx> wrote:
> > >
> > > Thanks a lot for a detailed bug report. CC Vitaly.
> >
> > thanks for CC'ing me.
> >
> > > The original email preserved for more context.
> >
> > Thanks Markus for bisecting. That really gave me the clue. I'll come
> > up with a patch within hours, would you be up for trying it?
>
> Patch: https://bugzilla.kernel.org/attachment.cgi?id=284507
>
> > Best regards,
> > Vitaly
> >
> > > On Sun 18-08-19 21:36:19, Markus Linnala wrote:
> > > > [1.] One line summary of the problem:
> > > >
> > > > zswap with z3fold makes swap stuck
> > > >
> > > >
> > > > [2.] Full description of the problem/report:
> > > >
> > > > I've enabled zswap using kernel parameters: zswap.enabled=1 zswap.zpool=z3fold
> > > > When the issue occurs, every process that is swapping gets stuck.
> > > >
> > > > I can reproduce this almost every time on vanilla v5.3-rc4 by running
> > > > the tool "stress" repeatedly.
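
A note on the reproducer: the exact "stress" command line is not given in the report. The sketch below is only an assumption about the kind of load involved. It allocates a buffer, dirties one byte per stride and reads it back, which is enough to push pages through swap (and therefore zswap) once the buffer is larger than free RAM. The helper names, the byte value and the stride are illustrative and are not taken from the stress source.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: dirty one byte every `stride` bytes so that each
 * touched page becomes resident and, under memory pressure, swappable. */
static void touch_pages(unsigned char *buf, size_t bytes, size_t stride)
{
    for (size_t i = 0; i < bytes; i += stride)
        buf[i] = 'Z';
}

/* Read the same bytes back. Reading a swapped-out page forces a swap-in.
 * Returns the number of bytes that no longer hold the written value. */
static size_t verify_pages(const unsigned char *buf, size_t bytes, size_t stride)
{
    size_t bad = 0;

    for (size_t i = 0; i < bytes; i += stride)
        if (buf[i] != 'Z')
            bad++;
    return bad;
}

/* One allocate/touch/verify/free cycle.
 * Returns the mismatch count, or (size_t)-1 on bad input or failed malloc. */
static size_t hog_once(size_t bytes, size_t stride)
{
    unsigned char *buf;
    size_t bad;

    if (stride == 0)
        return (size_t)-1;
    buf = malloc(bytes);
    if (!buf)
        return (size_t)-1;
    touch_pages(buf, bytes, stride);
    bad = verify_pages(buf, bytes, stride);
    free(buf);
    return bad;
}
```

Calling hog_once() in a loop with a size above the guest's RAM and a 4096-byte stride would approximate the load; on a healthy kernel it should always return 0.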
> > > >
> > > >
> > > > Issue starts with these messages:
> > > > [ 41.818966] BUG: unable to handle page fault for address: fffff54cf8000028
> > > > [ 14.458709] general protection fault: 0000 [#1] SMP PTI
> > > > [ 14.143173] kernel BUG at lib/list_debug.c:54!
> > > > [ 127.971860] kernel BUG at include/linux/mm.h:607!
> > > >
> > > >
> > > > [3.] Keywords (i.e., modules, networking, kernel):
> > > >
> > > > zswap z3fold swapping swap bisect
> > > >
> > > >
> > > > [4.] Kernel information
> > > >
> > > > [4.1.] Kernel version (from /proc/version):
> > > >
> > > > $ cat /proc/version
> > > > Linux version 5.3.0-rc4 (maage@xxxxxxxxxxxxxxx) (gcc version 9.1.1
> > > > 20190503 (Red Hat 9.1.1-1) (GCC)) #69 SMP Fri Aug 16 19:52:23 EEST
> > > > 2019
> > > >
> > > >
> > > > [4.2.] Kernel .config file:
> > > >
> > > > Attached as config-5.3.0-rc4
> > > >
> > > > My vanilla kernel config is based on the Fedora kernel config, but
> > > > with most drivers not used on the testing machine disabled to speed
> > > > up test builds.
> > > >
> > > >
> > > > [5.] Most recent kernel version which did not have the bug:
> > > >
> > > > I'm able to reproduce the issue in vanilla v5.3-rc4 and in every commit
> > > > marked bad during git bisect between v5.1 (good) and v5.3-rc4 (bad). I
> > > > can also reproduce the issue with some Fedora kernels, at least from
> > > > 5.2.1-200.fc30.x86_64 on. About Fedora kernels:
> > > > https://bugzilla.redhat.com/show_bug.cgi?id=1740690
> > > >
> > > > Result from git bisect:
> > > >
> > > > 7c2b8baa61fe578af905342938ad12f8dbaeae79 is the first bad commit
> > > >
> > > > commit 7c2b8baa61fe578af905342938ad12f8dbaeae79
> > > > Author: Vitaly Wool <vitalywool@xxxxxxxxx>
> > > > Date: Mon May 13 17:22:49 2019 -0700
> > > >
> > > > mm/z3fold.c: add structure for buddy handles
> > > >
> > > > For z3fold to be able to move its pages per request of the memory
> > > > subsystem, it should not use direct object addresses in handles. Instead,
> > > > it will create abstract handles (3 per page) which will contain pointers
> > > > to z3fold objects. Thus, it will be possible to change these pointers
> > > > when z3fold page is moved.
> > > >
> > > > Link: http://lkml.kernel.org/r/20190417103826.484eaf18c1294d682769880f@xxxxxxxxx
> > > > Signed-off-by: Vitaly Wool <vitaly.vul@xxxxxxxx>
> > > > Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@xxxxxxxxxxx>
> > > > Cc: Dan Streetman <ddstreet@xxxxxxxx>
> > > > Cc: Krzysztof Kozlowski <k.kozlowski@xxxxxxxxxxx>
> > > > Cc: Oleksiy Avramchenko <oleksiy.avramchenko@xxxxxxxxxxxxxx>
> > > > Cc: Uladzislau Rezki <urezki@xxxxxxxxx>
> > > > Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> > > > Signed-off-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
> > > >
> > > > :040000 040000 1a27b311b3ad8556062e45fff84d46a57ba8a4b1
> > > > a79e463e14ab8ea271a89fb5f3069c3c84221478 M mm
> > > > bisect run success
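
The commit message above describes an indirection: a handle no longer holds the object address itself but points at a slot that holds it, so the object can move while the handle stays valid. A minimal user-space sketch of that idea follows. The type and function names are hypothetical and the layout is an assumption; this is not the mm/z3fold.c implementation.

```c
#include <stddef.h>
#include <stdint.h>

#define SLOTS_PER_PAGE 3        /* "3 per page", one slot per buddy */

/* Hypothetical slot table: each slot stores the current object address. */
struct slot_table {
    uintptr_t slot[SLOTS_PER_PAGE];
};

/* A handle is the address of a slot, not the address of the object. */
typedef uintptr_t handle_t;

/* Record the object address in slot `idx` and hand out the slot address. */
static handle_t make_handle(struct slot_table *t, int idx, void *obj)
{
    t->slot[idx] = (uintptr_t)obj;
    return (handle_t)&t->slot[idx];
}

/* Resolve a handle: one extra dereference through the slot. */
static void *handle_to_obj(handle_t h)
{
    return (void *)*(uintptr_t *)h;
}

/* Moving an object only rewrites the slot; existing handles stay valid. */
static void relocate(struct slot_table *t, int idx, void *new_obj)
{
    t->slot[idx] = (uintptr_t)new_obj;
}
```

The cost of the scheme is that every lookup depends on the slot contents being valid. If a slot is stale or overwritten, resolving the handle yields a bad address.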
> > > >
> > > >
> > > > [6.] Output of Oops.. message (if applicable) with symbolic information
> > > > resolved (see Documentation/admin-guide/bug-hunting.rst)
> > > >
> > > > 1st Full dmesg attached: dmesg-5.3.0-rc4-1566111932.476354086.txt
> > > >
> > > > [ 105.710330] BUG: unable to handle page fault for address: ffffd2df8a000028
> > > > [ 105.714547] #PF: supervisor read access in kernel mode
> > > > [ 105.717893] #PF: error_code(0x0000) - not-present page
> > > > [ 105.721227] PGD 0 P4D 0
> > > > [ 105.722884] Oops: 0000 [#1] SMP PTI
> > > > [ 105.725152] CPU: 0 PID: 1240 Comm: stress Not tainted 5.3.0-rc4 #69
> > > > [ 105.729219] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009),
> > > > BIOS 1.12.0-2.fc30 04/01/2014
> > > > [ 105.734756] RIP: 0010:z3fold_zpool_map+0x52/0x110
> > > > [ 105.737801] Code: e8 48 01 ea 0f 82 ca 00 00 00 48 c7 c3 00 00 00
> > > > 80 48 2b 1d 70 eb e4 00 48 01 d3 48 c1 eb 0c 48 c1 e3 06 48 03 1d 4e
> > > > eb e4 00 <48> 8b 53 28 83 e2 01 74 07 5b 5d 41 5c 41 5d c3 4c 8d 6d 10
> > > > 4c 89
> > > > [ 105.749901] RSP: 0018:ffffa82d809a33f8 EFLAGS: 00010286
> > > > [ 105.753230] RAX: 0000000000000000 RBX: ffffd2df8a000000 RCX: 0000000000000000
> > > > [ 105.757754] RDX: 0000000080000000 RSI: ffff90edbab538d8 RDI: ffff90edb5fdd600
> > > > [ 105.762362] RBP: 0000000000000000 R08: ffff90edb5fdd600 R09: 0000000000000000
> > > > [ 105.766973] R10: 0000000000000003 R11: 0000000000000000 R12: ffff90edbab538d8
> > > > [ 105.771577] R13: ffff90edb5fdd6a0 R14: ffff90edb5fdd600 R15: ffffa82d809a3438
> > > > [ 105.776190] FS: 00007ff6a887b740(0000) GS:ffff90edbe400000(0000)
> > > > knlGS:0000000000000000
> > > > [ 105.780549] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > > [ 105.781436] CR2: ffffd2df8a000028 CR3: 0000000036fde006 CR4: 0000000000160ef0
> > > > [ 105.782365] Call Trace:
> > > > [ 105.782668] zswap_writeback_entry+0x50/0x410
> > > > [ 105.783199] z3fold_zpool_shrink+0x4a6/0x540
> > > > [ 105.783717] zswap_frontswap_store+0x424/0x7c1
> > > > [ 105.784329] __frontswap_store+0xc4/0x162
> > > > [ 105.784815] swap_writepage+0x39/0x70
> > > > [ 105.785282] pageout.isra.0+0x12c/0x5d0
> > > > [ 105.785730] shrink_page_list+0x1124/0x1830
> > > > [ 105.786335] shrink_inactive_list+0x1da/0x460
> > > > [ 105.786882] ? lruvec_lru_size+0x10/0x130
> > > > [ 105.787472] shrink_node_memcg+0x202/0x770
> > > > [ 105.788011] ? sched_clock_cpu+0xc/0xc0
> > > > [ 105.788594] shrink_node+0xdc/0x4a0
> > > > [ 105.789012] do_try_to_free_pages+0xdb/0x3c0
> > > > [ 105.789528] try_to_free_pages+0x112/0x2e0
> > > > [ 105.790009] __alloc_pages_slowpath+0x422/0x1000
> > > > [ 105.790547] ? __lock_acquire+0x247/0x1900
> > > > [ 105.791040] __alloc_pages_nodemask+0x37f/0x400
> > > > [ 105.791580] alloc_pages_vma+0x79/0x1e0
> > > > [ 105.792064] __read_swap_cache_async+0x1ec/0x3e0
> > > > [ 105.792639] swap_cluster_readahead+0x184/0x330
> > > > [ 105.793194] ? find_held_lock+0x32/0x90
> > > > [ 105.793681] swapin_readahead+0x2b4/0x4e0
> > > > [ 105.794182] ? sched_clock_cpu+0xc/0xc0
> > > > [ 105.794668] do_swap_page+0x3ac/0xc30
> > > > [ 105.795658] __handle_mm_fault+0x8dd/0x1900
> > > > [ 105.796729] handle_mm_fault+0x159/0x340
> > > > [ 105.797723] do_user_addr_fault+0x1fe/0x480
> > > > [ 105.798736] do_page_fault+0x31/0x210
> > > > [ 105.799700] page_fault+0x3e/0x50
> > > > [ 105.800597] RIP: 0033:0x56076f49e298
> > > > [ 105.801561] Code: 7e 01 00 00 89 df e8 47 e1 ff ff 44 8b 2d 84 4d
> > > > 00 00 4d 85 ff 7e 40 31 c0 eb 0f 0f 1f 80 00 00 00 00 4c 01 f0 49 39
> > > > c7 7e 2d <80> 7c 05 00 5a 4c 8d 54 05 00 74 ec 4c 89 14 24 45 85 ed 0f
> > > > 89 de
> > > > [ 105.804770] RSP: 002b:00007ffe5fc72e70 EFLAGS: 00010206
> > > > [ 105.805931] RAX: 00000000013ad000 RBX: ffffffffffffffff RCX: 00007ff6a8974156
> > > > [ 105.807300] RDX: 0000000000000000 RSI: 000000000b78d000 RDI: 0000000000000000
> > > > [ 105.808679] RBP: 00007ff69d0ee010 R08: 00007ff69d0ee010 R09: 0000000000000000
> > > > [ 105.810055] R10: 00007ff69e49a010 R11: 0000000000000246 R12: 000056076f4a0004
> > > > [ 105.811383] R13: 0000000000000002 R14: 0000000000001000 R15: 000000000b78cc00
> > > > [ 105.812713] Modules linked in: ip6t_rpfilter ip6t_REJECT
> > > > nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ip6table_nat
> > > > ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat
> > > > iptable_mangle iptable_raw iptable_security nf_conntrack
> > > > nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c ip_set nfnetlink
> > > > ip6table_filter ip6_tables iptable_filter ip_tables crct10dif_pclmul
> > > > crc32_pclmul ghash_clmulni_intel virtio_balloon virtio_net
> > > > net_failover intel_agp failover intel_gtt qxl drm_kms_helper
> > > > syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm crc32c_intel
> > > > serio_raw agpgart virtio_blk virtio_console qemu_fw_cfg
> > > > [ 105.821561] CR2: ffffd2df8a000028
> > > > [ 105.822552] ---[ end trace d5f24e2cb83a2b76 ]---
> > > > [ 105.823659] RIP: 0010:z3fold_zpool_map+0x52/0x110
> > > > [ 105.824785] Code: e8 48 01 ea 0f 82 ca 00 00 00 48 c7 c3 00 00 00
> > > > 80 48 2b 1d 70 eb e4 00 48 01 d3 48 c1 eb 0c 48 c1 e3 06 48 03 1d 4e
> > > > eb e4 00 <48> 8b 53 28 83 e2 01 74 07 5b 5d 41 5c 41 5d c3 4c 8d 6d 10
> > > > 4c 89
> > > > [ 105.828082] RSP: 0018:ffffa82d809a33f8 EFLAGS: 00010286
> > > > [ 105.829287] RAX: 0000000000000000 RBX: ffffd2df8a000000 RCX: 0000000000000000
> > > > [ 105.830713] RDX: 0000000080000000 RSI: ffff90edbab538d8 RDI: ffff90edb5fdd600
> > > > [ 105.832157] RBP: 0000000000000000 R08: ffff90edb5fdd600 R09: 0000000000000000
> > > > [ 105.833607] R10: 0000000000000003 R11: 0000000000000000 R12: ffff90edbab538d8
> > > > [ 105.835054] R13: ffff90edb5fdd6a0 R14: ffff90edb5fdd600 R15: ffffa82d809a3438
> > > > [ 105.836489] FS: 00007ff6a887b740(0000) GS:ffff90edbe400000(0000)
> > > > knlGS:0000000000000000
> > > > [ 105.838103] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > > [ 105.839405] CR2: ffffd2df8a000028 CR3: 0000000036fde006 CR4: 0000000000160ef0
> > > > [ 105.840883] ------------[ cut here ]------------
> > > >
> > > >
> > > > (gdb) l *zswap_writeback_entry+0x50
> > > > 0xffffffff812e8490 is in zswap_writeback_entry (/src/linux/mm/zswap.c:858).
> > > > 853 .sync_mode = WB_SYNC_NONE,
> > > > 854 };
> > > > 855
> > > > 856 /* extract swpentry from data */
> > > > 857 zhdr = zpool_map_handle(pool, handle, ZPOOL_MM_RO);
> > > > 858 swpentry = zhdr->swpentry; /* here */
> > > > 859 zpool_unmap_handle(pool, handle);
> > > > 860 tree = zswap_trees[swp_type(swpentry)];
> > > > 861 offset = swp_offset(swpentry);
> > > >
> > > >
> > > > (gdb) l *z3fold_zpool_map+0x52
> > > > 0xffffffff81337b32 is in z3fold_zpool_map
> > > > (/src/linux/arch/x86/include/asm/bitops.h:207).
> > > > 202 return GEN_BINARY_RMWcc(LOCK_PREFIX __ASM_SIZE(btc), *addr, c, "Ir", nr);
> > > > 203 }
> > > > 204
> > > > 205 static __always_inline bool constant_test_bit(long nr, const
> > > > volatile unsigned long *addr)
> > > > 206 {
> > > > 207 return ((1UL << (nr & (BITS_PER_LONG-1))) &
> > > > 208 (addr[nr >> _BITOPS_LONG_SHIFT])) != 0;
> > > > 209 }
> > > > 210
> > > > 211 static __always_inline bool variable_test_bit(long nr, volatile
> > > > const unsigned long *addr)
> > > >
> > > >
> > > > (gdb) l *z3fold_zpool_shrink+0x4a6
> > > > 0xffffffff81338796 is in z3fold_zpool_shrink (/src/linux/mm/z3fold.c:1173).
> > > > 1168 ret = pool->ops->evict(pool, first_handle);
> > > > 1169 if (ret)
> > > > 1170 goto next;
> > > > 1171 }
> > > > 1172 if (last_handle) {
> > > > 1173 ret = pool->ops->evict(pool, last_handle);
> > > > 1174 if (ret)
> > > > 1175 goto next;
> > > > 1176 }
> > > > 1177 next:
> > > >
> > > >
> > > > Because of the test setup and swapping, ssh/shell etc. are usually
> > > > stuck, so it is not possible to get dmesg for the other cases. I've
> > > > used console logging instead. It misses the boot messages, though;
> > > > they should be about the same as in the 1st case.
> > > >
> > > >
> > > > 2nd console log attached: console-1566133726.340057021.log
> > > >
> > > > [ 14.324867] general protection fault: 0000 [#1] SMP PTI
> > > > [ 14.330269] CPU: 1 PID: 150 Comm: kswapd0 Tainted: G W
> > > > 5.3.0-rc4 #69
> > > > [ 14.331359] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009),
> > > > BIOS 1.12.0-2.fc30 04/01/2014
> > > > [ 14.332511] RIP: 0010:handle_to_buddy+0x20/0x30
> > > > [ 14.333478] Code: 84 00 00 00 00 00 0f 1f 40 00 0f 1f 44 00 00 53
> > > > 48 89 fb 83 e7 01 0f 85 01 26 00 00 48 8b 03 5b 48 89 c2 48 81 e2 00
> > > > f0 ff ff <0f> b6 92 ca 00 00 00 29 d0 83 e0 03 c3 0f 1f 00 0f 1f 44 00
> > > > 00 55
> > > > [ 14.336310] RSP: 0000:ffffb6cc0019f820 EFLAGS: 00010206
> > > > [ 14.337112] RAX: 00ffff8b24c22ed0 RBX: fffff46a4008bb40 RCX: 0000000000000000
> > > > [ 14.338174] RDX: 00ffff8b24c22000 RSI: ffff8b24fe7d89c8 RDI: ffff8b24fe7d89c8
> > > > [ 14.339112] RBP: ffff8b24c22ed000 R08: ffff8b24fe7d89c8 R09: 0000000000000000
> > > > [ 14.340407] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8b24c22ed001
> > > > [ 14.341445] R13: ffff8b24c22ed010 R14: ffff8b24f5f70a00 R15: ffffb6cc0019f868
> > > > [ 14.342439] FS: 0000000000000000(0000) GS:ffff8b24fe600000(0000)
> > > > knlGS:0000000000000000
> > > > [ 14.343937] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > > [ 14.344771] CR2: 00007f37563d4010 CR3: 0000000008212005 CR4: 0000000000160ee0
> > > > [ 14.345816] Call Trace:
> > > > [ 14.346182] z3fold_zpool_map+0x76/0x110
> > > > [ 14.347111] zswap_writeback_entry+0x50/0x410
> > > > [ 14.347828] z3fold_zpool_shrink+0x3c4/0x540
> > > > [ 14.348457] zswap_frontswap_store+0x424/0x7c1
> > > > [ 14.349134] __frontswap_store+0xc4/0x162
> > > > [ 14.349746] swap_writepage+0x39/0x70
> > > > [ 14.350292] pageout.isra.0+0x12c/0x5d0
> > > > [ 14.350899] shrink_page_list+0x1124/0x1830
> > > > [ 14.351473] shrink_inactive_list+0x1da/0x460
> > > > [ 14.352068] shrink_node_memcg+0x202/0x770
> > > > [ 14.352697] shrink_node+0xdc/0x4a0
> > > > [ 14.353204] balance_pgdat+0x2e7/0x580
> > > > [ 14.353773] kswapd+0x239/0x500
> > > > [ 14.354241] ? finish_wait+0x90/0x90
> > > > [ 14.355003] kthread+0x108/0x140
> > > > [ 14.355619] ? balance_pgdat+0x580/0x580
> > > > [ 14.356216] ? kthread_park+0x80/0x80
> > > > [ 14.356782] ret_from_fork+0x3a/0x50
> > > > [ 14.357859] Modules linked in: ip6t_rpfilter ip6t_REJECT
> > > > nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ip6table_nat
> > > > ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat
> > > > iptable_mangle iptable_raw iptable_security nf_conntrack
> > > > nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c ip_set nfnetlink
> > > > ip6table_filter ip6_tables iptable_filter ip_tables crct10dif_pclmul
> > > > crc32_pclmul ghash_clmulni_intel virtio_net net_failover
> > > > virtio_balloon failover intel_agp intel_gtt qxl drm_kms_helper
> > > > syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm crc32c_intel
> > > > serio_raw virtio_blk virtio_console agpgart qemu_fw_cfg
> > > > [ 14.369818] ---[ end trace 351ba6e5814522bd ]---
> > > >
> > > >
> > > > (gdb) l *z3fold_zpool_map+0x76
> > > > 0xffffffff81337b56 is in z3fold_zpool_map (/src/linux/mm/z3fold.c:1239).
> > > > 1234 if (test_bit(PAGE_HEADLESS, &page->private))
> > > > 1235 goto out;
> > > > 1236
> > > > 1237 z3fold_page_lock(zhdr);
> > > > 1238 buddy = handle_to_buddy(handle);
> > > > 1239 switch (buddy) {
> > > > 1240 case FIRST:
> > > > 1241 addr += ZHDR_SIZE_ALIGNED;
> > > > 1242 break;
> > > > 1243 case MIDDLE:
> > > >
> > > > (gdb) l *z3fold_zpool_shrink+0x3c4
> > > > 0xffffffff813386b4 is in z3fold_zpool_shrink (/src/linux/mm/z3fold.c:1168).
> > > > 1163 ret = pool->ops->evict(pool, middle_handle);
> > > > 1164 if (ret)
> > > > 1165 goto next;
> > > > 1166 }
> > > > 1167 if (first_handle) {
> > > > 1168 ret = pool->ops->evict(pool, first_handle);
> > > > 1169 if (ret)
> > > > 1170 goto next;
> > > > 1171 }
> > > > 1172 if (last_handle) {
> > > >
> > > > (gdb) l *handle_to_buddy+0x20
> > > > 0xffffffff81337550 is in handle_to_buddy (/src/linux/mm/z3fold.c:425).
> > > > 420 unsigned long addr;
> > > > 421
> > > > 422 WARN_ON(handle & (1 << PAGE_HEADLESS));
> > > > 423 addr = *(unsigned long *)handle;
> > > > 424 zhdr = (struct z3fold_header *)(addr & PAGE_MASK);
> > > > 425 return (addr - zhdr->first_num) & BUDDY_MASK;
> > > > 426 }
> > > > 427
> > > > 428 static inline struct z3fold_pool *zhdr_to_pool(struct z3fold_header *zhdr)
> > > > 429 {
> > > >
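
In the second trace, the faulting instruction in handle_to_buddy() dereferences RDX, and RDX (00ffff8b24c22000) is RAX with the low 12 bits masked off. On x86-64 with 4-level paging an address is canonical only if bits 63..47 are all equal, and dereferencing a non-canonical address raises a general protection fault instead of a page fault. That would be consistent with the "general protection fault" line in this log. A minimal check, assuming 48-bit virtual addresses (the function name is made up for illustration):

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumes 4-level paging (48-bit virtual addresses): bits 63..47 must be
 * either all zero (user half) or all one (kernel half). */
static bool is_canonical_48(uint64_t addr)
{
    uint64_t top = addr >> 47;      /* the 17 most significant bits */

    return top == 0 || top == 0x1ffff;
}
```

With the values from the register dump, is_canonical_48(0x00ffff8b24c22000) is false, while RBP (0xffff8b24c22ed000) passes the check.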
> > > >
> > > > 3rd console log attached: console-1566146080.512045588.log
> > > >
> > > > [ 4180.615506] kernel BUG at lib/list_debug.c:54!
> > > > [ 4180.617034] invalid opcode: 0000 [#1] SMP PTI
> > > > [ 4180.618059] CPU: 3 PID: 2129 Comm: stress Tainted: G W
> > > > 5.3.0-rc4 #69
> > > > [ 4180.619811] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009),
> > > > BIOS 1.12.0-2.fc30 04/01/2014
> > > > [ 4180.621757] RIP: 0010:__list_del_entry_valid.cold+0x1d/0x55
> > > > [ 4180.623035] Code: c7 c7 20 fb 11 8f e8 55 7e bf ff 0f 0b 48 89 fe
> > > > 48 c7 c7 b0 fb 11 8f e8 44 7e bf ff 0f 0b 48 c7 c7 60 fc 11 8f e8 36
> > > > 7e bf ff <0f> 0b 48 89 f2 48 89 fe 48 c7 c7 20 fc 11 8f e8 22 7e bf ff
> > > > 0f 0b
> > > > [ 4180.627262] RSP: 0000:ffffacfcc097f4c8 EFLAGS: 00010246
> > > > [ 4180.628459] RAX: 0000000000000054 RBX: ffff88a102053000 RCX: 0000000000000000
> > > > [ 4180.630077] RDX: 0000000000000000 RSI: ffff88a13bbd89c8 RDI: ffff88a13bbd89c8
> > > > [ 4180.631693] RBP: ffff88a102053000 R08: ffff88a13bbd89c8 R09: 0000000000000000
> > > > [ 4180.633271] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88a13098a200
> > > > [ 4180.634899] R13: ffff88a13098a208 R14: 0000000000000000 R15: ffff88a102053010
> > > > [ 4180.636539] FS: 00007f86b900e740(0000) GS:ffff88a13ba00000(0000)
> > > > knlGS:0000000000000000
> > > > [ 4180.638394] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > > [ 4180.639733] CR2: 00007f86b1e1f010 CR3: 000000002f21e002 CR4: 0000000000160ee0
> > > > [ 4180.641383] Call Trace:
> > > > [ 4180.641965] z3fold_zpool_malloc+0x106/0xa40
> > > > [ 4180.642965] zswap_frontswap_store+0x2e8/0x7c1
> > > > [ 4180.643978] __frontswap_store+0xc4/0x162
> > > > [ 4180.644875] swap_writepage+0x39/0x70
> > > > [ 4180.645695] pageout.isra.0+0x12c/0x5d0
> > > > [ 4180.646553] shrink_page_list+0x1124/0x1830
> > > > [ 4180.647538] shrink_inactive_list+0x1da/0x460
> > > > [ 4180.648564] shrink_node_memcg+0x202/0x770
> > > > [ 4180.649529] ? sched_clock_cpu+0xc/0xc0
> > > > [ 4180.650432] shrink_node+0xdc/0x4a0
> > > > [ 4180.651258] do_try_to_free_pages+0xdb/0x3c0
> > > > [ 4180.652261] try_to_free_pages+0x112/0x2e0
> > > > [ 4180.653217] __alloc_pages_slowpath+0x422/0x1000
> > > > [ 4180.654294] ? __lock_acquire+0x247/0x1900
> > > > [ 4180.655254] __alloc_pages_nodemask+0x37f/0x400
> > > > [ 4180.656312] alloc_pages_vma+0x79/0x1e0
> > > > [ 4180.657169] __read_swap_cache_async+0x1ec/0x3e0
> > > > [ 4180.658197] swap_cluster_readahead+0x184/0x330
> > > > [ 4180.659211] ? find_held_lock+0x32/0x90
> > > > [ 4180.660111] swapin_readahead+0x2b4/0x4e0
> > > > [ 4180.661046] ? sched_clock_cpu+0xc/0xc0
> > > > [ 4180.661949] do_swap_page+0x3ac/0xc30
> > > > [ 4180.662807] __handle_mm_fault+0x8dd/0x1900
> > > > [ 4180.663790] handle_mm_fault+0x159/0x340
> > > > [ 4180.664713] do_user_addr_fault+0x1fe/0x480
> > > > [ 4180.665691] do_page_fault+0x31/0x210
> > > > [ 4180.666552] page_fault+0x3e/0x50
> > > > [ 4180.667818] RIP: 0033:0x555b3127d298
> > > > [ 4180.669153] Code: 7e 01 00 00 89 df e8 47 e1 ff ff 44 8b 2d 84 4d
> > > > 00 00 4d 85 ff 7e 40 31 c0 eb 0f 0f 1f 80 00 00 00 00 4c 01 f0 49 39
> > > > c7 7e 2d <80> 7c 05 00 5a 4c 8d 54 05 00 74 ec 4c 89 14 24 45 85 ed 0f
> > > > 89 de
> > > > [ 4180.676117] RSP: 002b:00007ffc7a9f9bf0 EFLAGS: 00010206
> > > > [ 4180.678515] RAX: 0000000000038000 RBX: ffffffffffffffff RCX: 00007f86b9107156
> > > > [ 4180.681657] RDX: 0000000000000000 RSI: 000000000b805000 RDI: 0000000000000000
> > > > [ 4180.684762] RBP: 00007f86ad809010 R08: 00007f86ad809010 R09: 0000000000000000
> > > > [ 4180.687846] R10: 00007f86ad840010 R11: 0000000000000246 R12: 0000555b3127f004
> > > > [ 4180.690919] R13: 0000000000000002 R14: 0000000000001000 R15: 000000000b804000
> > > > [ 4180.693967] Modules linked in: ip6t_rpfilter ip6t_REJECT
> > > > nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ip6table_nat
> > > > ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat
> > > > iptable_mangle iptable_raw iptable_security nf_conntrack
> > > > nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c ip_set nfnetlink
> > > > ip6table_filter ip6_tables iptable_filter ip_tables crct10dif_pclmul
> > > > crc32_pclmul ghash_clmulni_intel virtio_net virtio_balloon
> > > > net_failover intel_agp failover intel_gtt qxl drm_kms_helper
> > > > syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm crc32c_intel
> > > > serio_raw virtio_blk virtio_console agpgart qemu_fw_cfg
> > > > [ 4180.715768] ---[ end trace 6eab0ae003d4d2ea ]---
> > > > [ 4180.718021] RIP: 0010:__list_del_entry_valid.cold+0x1d/0x55
> > > > [ 4180.720602] Code: c7 c7 20 fb 11 8f e8 55 7e bf ff 0f 0b 48 89 fe
> > > > 48 c7 c7 b0 fb 11 8f e8 44 7e bf ff 0f 0b 48 c7 c7 60 fc 11 8f e8 36
> > > > 7e bf ff <0f> 0b 48 89 f2 48 89 fe 48 c7 c7 20 fc 11 8f e8 22 7e bf ff
> > > > 0f 0b
> > > > [ 4180.728474] RSP: 0000:ffffacfcc097f4c8 EFLAGS: 00010246
> > > > [ 4180.730969] RAX: 0000000000000054 RBX: ffff88a102053000 RCX: 0000000000000000
> > > > [ 4180.734130] RDX: 0000000000000000 RSI: ffff88a13bbd89c8 RDI: ffff88a13bbd89c8
> > > > [ 4180.737285] RBP: ffff88a102053000 R08: ffff88a13bbd89c8 R09: 0000000000000000
> > > > [ 4180.740442] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88a13098a200
> > > > [ 4180.743609] R13: ffff88a13098a208 R14: 0000000000000000 R15: ffff88a102053010
> > > > [ 4180.746774] FS: 00007f86b900e740(0000) GS:ffff88a13ba00000(0000)
> > > > knlGS:0000000000000000
> > > > [ 4180.750294] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > > [ 4180.752986] CR2: 00007f86b1e1f010 CR3: 000000002f21e002 CR4: 0000000000160ee0
> > > > [ 4180.756176] ------------[ cut here ]------------
> > > >
> > > > (gdb) l *z3fold_zpool_malloc+0x106
> > > > 0xffffffff81338936 is in z3fold_zpool_malloc
> > > > (/src/linux/include/linux/list.h:190).
> > > > 185 * list_del_init - deletes entry from list and reinitialize it.
> > > > 186 * @entry: the element to delete from the list.
> > > > 187 */
> > > > 188 static inline void list_del_init(struct list_head *entry)
> > > > 189 {
> > > > 190 __list_del_entry(entry);
> > > > 191 INIT_LIST_HEAD(entry);
> > > > 192 }
> > > > 193
> > > > 194 /**
> > > >
> > > > (gdb) l *zswap_frontswap_store+0x2e8
> > > > 0xffffffff812e8b38 is in zswap_frontswap_store (/src/linux/mm/zswap.c:1073).
> > > > 1068 goto put_dstmem;
> > > > 1069 }
> > > > 1070
> > > > 1071 /* store */
> > > > 1072 hlen = zpool_evictable(entry->pool->zpool) ? sizeof(zhdr) : 0;
> > > > 1073 ret = zpool_malloc(entry->pool->zpool, hlen + dlen,
> > > > 1074 __GFP_NORETRY | __GFP_NOWARN | __GFP_KSWAPD_RECLAIM,
> > > > 1075 &handle);
> > > > 1076 if (ret == -ENOSPC) {
> > > > 1077 zswap_reject_compress_poor++;
> > > >
> > > >
> > > > 4th console log attached: console-1566151496.204958451.log
> > > >
> > > > [ 66.090333] BUG: unable to handle page fault for address: ffffeab2e2000028
> > > > [ 66.091245] #PF: supervisor read access in kernel mode
> > > > [ 66.091904] #PF: error_code(0x0000) - not-present page
> > > > [ 66.092552] PGD 0 P4D 0
> > > > [ 66.092885] Oops: 0000 [#1] SMP PTI
> > > > [ 66.093332] CPU: 2 PID: 1193 Comm: stress Not tainted 5.3.0-rc4 #69
> > > > [ 66.094127] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009),
> > > > BIOS 1.12.0-2.fc30 04/01/2014
> > > > [ 66.095204] RIP: 0010:z3fold_zpool_map+0x52/0x110
> > > > [ 66.095799] Code: e8 48 01 ea 0f 82 ca 00 00 00 48 c7 c3 00 00 00
> > > > 80 48 2b 1d 70 eb e4 00 48 01 d3 48 c1 eb 0c 48 c1 e3 06 48 03 1d 4e
> > > > eb e4 00 <48> 8b 53 28 83 e2 01 74 07 5b 5d 41 5c 41 5d c3 4c 8d 6d 10
> > > > 4c 89
> > > > [ 66.098132] RSP: 0000:ffffb7a2009375e8 EFLAGS: 00010286
> > > > [ 66.098792] RAX: 0000000000000000 RBX: ffffeab2e2000000 RCX: 0000000000000000
> > > > [ 66.099685] RDX: 0000000080000000 RSI: ffff9f67bb10e688 RDI: ffff9f67b39bca00
> > > > [ 66.100579] RBP: 0000000000000000 R08: ffff9f67b39bca00 R09: 0000000000000000
> > > > [ 66.101477] R10: 0000000000000003 R11: 0000000000000000 R12: ffff9f67bb10e688
> > > > [ 66.102367] R13: ffff9f67b39bcaa0 R14: ffff9f67b39bca00 R15: ffffb7a200937628
> > > > [ 66.103263] FS: 00007f33df62b740(0000) GS:ffff9f67be800000(0000)
> > > > knlGS:0000000000000000
> > > > [ 66.104264] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > > [ 66.104988] CR2: ffffeab2e2000028 CR3: 000000003798a001 CR4: 0000000000160ee0
> > > > [ 66.105878] Call Trace:
> > > > [ 66.106202] zswap_writeback_entry+0x50/0x410
> > > > [ 66.106761] z3fold_zpool_shrink+0x29d/0x540
> > > > [ 66.107305] zswap_frontswap_store+0x424/0x7c1
> > > > [ 66.107870] __frontswap_store+0xc4/0x162
> > > > [ 66.108383] swap_writepage+0x39/0x70
> > > > [ 66.108847] pageout.isra.0+0x12c/0x5d0
> > > > [ 66.109340] shrink_page_list+0x1124/0x1830
> > > > [ 66.109872] shrink_inactive_list+0x1da/0x460
> > > > [ 66.110430] shrink_node_memcg+0x202/0x770
> > > > [ 66.110955] shrink_node+0xdc/0x4a0
> > > > [ 66.111403] do_try_to_free_pages+0xdb/0x3c0
> > > > [ 66.111946] try_to_free_pages+0x112/0x2e0
> > > > [ 66.112468] __alloc_pages_slowpath+0x422/0x1000
> > > > [ 66.113064] ? __lock_acquire+0x247/0x1900
> > > > [ 66.113596] __alloc_pages_nodemask+0x37f/0x400
> > > > [ 66.114179] alloc_pages_vma+0x79/0x1e0
> > > > [ 66.114675] __handle_mm_fault+0x99c/0x1900
> > > > [ 66.115218] handle_mm_fault+0x159/0x340
> > > > [ 66.115719] do_user_addr_fault+0x1fe/0x480
> > > > [ 66.116256] do_page_fault+0x31/0x210
> > > > [ 66.116730] page_fault+0x3e/0x50
> > > > [ 66.117168] RIP: 0033:0x556945873250
> > > > [ 66.117624] Code: 0f 84 88 02 00 00 8b 54 24 0c 31 c0 85 d2 0f 94
> > > > c0 89 04 24 41 83 fd 02 0f 8f f1 00 00 00 31 c0 4d 85 ff 7e 12 0f 1f
> > > > 44 00 00 <c6> 44 05 00 5a 4c 01 f0 49 39 c7 7f f3 48 85 db 0f 84 dd 01
> > > > 00 00
> > > > [ 66.120514] RSP: 002b:00007fffa5fc06c0 EFLAGS: 00010206
> > > > [ 66.121722] RAX: 000000000a0ad000 RBX: ffffffffffffffff RCX: 00007f33df724156
> > > > [ 66.123171] RDX: 0000000000000000 RSI: 000000000b7a4000 RDI: 0000000000000000
> > > > [ 66.124616] RBP: 00007f33d3e87010 R08: 00007f33d3e87010 R09: 0000000000000000
> > > > [ 66.126064] R10: 0000000000000022 R11: 0000000000000246 R12: 0000556945875004
> > > > [ 66.127499] R13: 0000000000000002 R14: 0000000000001000 R15: 000000000b7a3000
> > > > [ 66.128936] Modules linked in: ip6t_rpfilter ip6t_REJECT
> > > > nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ip6table_nat
> > > > ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat
> > > > iptable_mangle iptable_raw iptable_security nf_conntrack
> > > > nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c ip_set nfnetlink
> > > > ip6table_filter ip6_tables iptable_filter ip_tables crct10dif_pclmul
> > > > crc32_pclmul ghash_clmulni_intel virtio_balloon intel_agp virtio_net
> > > > net_failover failover intel_gtt qxl drm_kms_helper syscopyarea
> > > > sysfillrect sysimgblt fb_sys_fops ttm drm crc32c_intel serio_raw
> > > > virtio_blk virtio_console agpgart qemu_fw_cfg
> > > > [ 66.138533] CR2: ffffeab2e2000028
> > > > [ 66.139562] ---[ end trace bfa9f40a545e4544 ]---
> > > > [ 66.140733] RIP: 0010:z3fold_zpool_map+0x52/0x110
> > > > [ 66.141886] Code: e8 48 01 ea 0f 82 ca 00 00 00 48 c7 c3 00 00 00
> > > > 80 48 2b 1d 70 eb e4 00 48 01 d3 48 c1 eb 0c 48 c1 e3 06 48 03 1d 4e
> > > > eb e4 00 <48> 8b 53 28 83 e2 01 74 07 5b 5d 41 5c 41 5d c3 4c 8d 6d 10
> > > > 4c 89
> > > > [ 66.145387] RSP: 0000:ffffb7a2009375e8 EFLAGS: 00010286
> > > > [ 66.146654] RAX: 0000000000000000 RBX: ffffeab2e2000000 RCX: 0000000000000000
> > > > [ 66.148137] RDX: 0000000080000000 RSI: ffff9f67bb10e688 RDI: ffff9f67b39bca00
> > > > [ 66.149626] RBP: 0000000000000000 R08: ffff9f67b39bca00 R09: 0000000000000000
> > > > [ 66.151128] R10: 0000000000000003 R11: 0000000000000000 R12: ffff9f67bb10e688
> > > > [ 66.152606] R13: ffff9f67b39bcaa0 R14: ffff9f67b39bca00 R15: ffffb7a200937628
> > > > [ 66.154076] FS: 00007f33df62b740(0000) GS:ffff9f67be800000(0000)
> > > > knlGS:0000000000000000
> > > > [ 66.155695] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > > [ 66.157020] CR2: ffffeab2e2000028 CR3: 000000003798a001 CR4: 0000000000160ee0
> > > > [ 66.158535] ------------[ cut here ]------------
> > > >
> > > > (gdb) l *z3fold_zpool_shrink+0x29d
> > > > 0xffffffff8133858d is in z3fold_zpool_shrink (/src/linux/mm/z3fold.c:1168).
> > > > 1163 ret = pool->ops->evict(pool, middle_handle);
> > > > 1164 if (ret)
> > > > 1165 goto next;
> > > > 1166 }
> > > > 1167 if (first_handle) {
> > > > 1168 ret = pool->ops->evict(pool, first_handle);
> > > > 1169 if (ret)
> > > > 1170 goto next;
> > > > 1171 }
> > > > 1172 if (last_handle) {
> > > >
> > > >
> > > > 5th console log is: console-1566152424.019311951.log
> > > > [ 22.529023] kernel BUG at include/linux/mm.h:607!
> > > > [ 22.529092] BUG: kernel NULL pointer dereference, address: 0000000000000008
> > > > [ 22.531789] #PF: supervisor read access in kernel mode
> > > > [ 22.532954] #PF: error_code(0x0000) - not-present page
> > > > [ 22.533722] PGD 0 P4D 0
> > > > [ 22.534097] Oops: 0000 [#1] SMP PTI
> > > > [ 22.534585] CPU: 0 PID: 186 Comm: kworker/u8:4 Not tainted 5.3.0-rc4 #69
> > > > [ 22.535488] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009),
> > > > BIOS 1.12.0-2.fc30 04/01/2014
> > > > [ 22.536633] Workqueue: zswap1 compact_page_work
> > > > [ 22.537263] RIP: 0010:__list_add_valid+0x3/0x40
> > > > [ 22.537868] Code: f4 ff ff ff e9 3a ff ff ff 49 c7 07 00 00 00 00
> > > > 41 c7 47 08 00 00 00 00 e9 66 ff ff ff e8 15 f6 b6 ff 90 90 90 90 90
> > > > 49 89 d0 <48> 8b 52 08 48 39 f2 0f 85 7c 00 00 00 4c 8b 0a 4d 39 c1 0f
> > > > 85 98
> > > > [ 22.540322] RSP: 0000:ffffa073802cfdf8 EFLAGS: 00010206
> > > > [ 22.540953] RAX: 00000000000003c0 RBX: ffff8d69ad052000 RCX: 8888888888888889
> > > > [ 22.541838] RDX: 0000000000000000 RSI: ffffc0737f6012e8 RDI: ffff8d69ad052000
> > > > [ 22.542747] RBP: ffffc0737f6012e8 R08: 0000000000000000 R09: 0000000000000001
> > > > [ 22.543660] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
> > > > [ 22.544614] R13: ffff8d69bd0dfc00 R14: ffff8d69bd0dfc08 R15: ffff8d69ad052010
> > > > [ 22.545578] FS: 0000000000000000(0000) GS:ffff8d69be400000(0000)
> > > > knlGS:0000000000000000
> > > > [ 22.546662] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > > [ 22.547452] CR2: 0000000000000008 CR3: 0000000035304001 CR4: 0000000000160ef0
> > > > [ 22.548488] Call Trace:
> > > > [ 22.548845] do_compact_page+0x31e/0x430
> > > > [ 22.549406] process_one_work+0x272/0x5a0
> > > > [ 22.549972] worker_thread+0x50/0x3b0
> > > > [ 22.550488] kthread+0x108/0x140
> > > > [ 22.550939] ? process_one_work+0x5a0/0x5a0
> > > > [ 22.551531] ? kthread_park+0x80/0x80
> > > > [ 22.552034] ret_from_fork+0x3a/0x50
> > > > [ 22.552554] Modules linked in: ip6t_rpfilter ip6t_REJECT
> > > > nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ip6table_nat
> > > > ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat
> > > > iptable_mangle iptable_raw iptable_security nf_conntrack
> > > > nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c ip_set nfnetlink
> > > > ip6table_filter ip6_tables iptable_filter ip_tables crct10dif_pclmul
> > > > crc32_pclmul ghash_clmulni_intel virtio_balloon virtio_net
> > > > net_failover intel_agp intel_gtt failover qxl drm_kms_helper
> > > > syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm crc32c_intel
> > > > serio_raw virtio_console virtio_blk agpgart qemu_fw_cfg
> > > > [ 22.559889] CR2: 0000000000000008
> > > > [ 22.560328] ---[ end trace cfa4596e38137687 ]---
> > > > [ 22.560330] invalid opcode: 0000 [#2] SMP PTI
> > > > [ 22.560981] RIP: 0010:__list_add_valid+0x3/0x40
> > > > [ 22.561515] CPU: 2 PID: 1063 Comm: stress Tainted: G D
> > > > 5.3.0-rc4 #69
> > > > [ 22.562143] Code: f4 ff ff ff e9 3a ff ff ff 49 c7 07 00 00 00 00
> > > > 41 c7 47 08 00 00 00 00 e9 66 ff ff ff e8 15 f6 b6 ff 90 90 90 90 90
> > > > 49 89 d0 <48> 8b 52 08 48 39 f2 0f 85 7c 00 00 00 4c 8b 0a 4d 39 c1 0f
> > > > 85 98
> > > > [ 22.563034] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009),
> > > > BIOS 1.12.0-2.fc30 04/01/2014
> > > > [ 22.565759] RSP: 0000:ffffa073802cfdf8 EFLAGS: 00010206
> > > > [ 22.565760] RAX: 00000000000003c0 RBX: ffff8d69ad052000 RCX: 8888888888888889
> > > > [ 22.565761] RDX: 0000000000000000 RSI: ffffc0737f6012e8 RDI: ffff8d69ad052000
> > > > [ 22.565761] RBP: ffffc0737f6012e8 R08: 0000000000000000 R09: 0000000000000001
> > > > [ 22.565762] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
> > > > [ 22.565763] R13: ffff8d69bd0dfc00 R14: ffff8d69bd0dfc08 R15: ffff8d69ad052010
> > > > [ 22.565765] FS: 0000000000000000(0000) GS:ffff8d69be400000(0000)
> > > > knlGS:0000000000000000
> > > > [ 22.565766] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > > [ 22.565766] CR2: 0000000000000008 CR3: 0000000035304001 CR4: 0000000000160ef0
> > > > [ 22.565797] note: kworker/u8:4[186] exited with preempt_count 3
> > > > [ 22.581957] RIP: 0010:__free_pages+0x2d/0x30
> > > > [ 22.583146] Code: 00 00 8b 47 34 85 c0 74 15 f0 ff 4f 34 75 09 85
> > > > f6 75 06 e9 75 ff ff ff c3 e9 4f e2 ff ff 48 c7 c6 e8 8c 0a bb e8 d3
> > > > 7f fd ff <0f> 0b 90 0f 1f 44 00 00 89 f1 41 bb 01 00 00 00 49 89 fa 41
> > > > d3 e3
> > > > [ 22.586649] RSP: 0018:ffffa073809ef4d0 EFLAGS: 00010246
> > > > [ 22.587963] RAX: 000000000000003e RBX: ffff8d6992d10000 RCX: 0000000000000006
> > > > [ 22.589579] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffffbb0e5774
> > > > [ 22.591181] RBP: ffffd090004b4408 R08: 000000053ed5634a R09: 0000000000000000
> > > > [ 22.592781] R10: 0000000000000000 R11: 0000000000000000 R12: ffffd090004b4400
> > > > [ 22.594339] R13: ffff8d69bd0dfca0 R14: ffff8d69bd0dfc00 R15: ffff8d69bd0dfc08
> > > > [ 22.595832] FS: 00007f48316b7740(0000) GS:ffff8d69be800000(0000)
> > > > knlGS:0000000000000000
> > > > [ 22.598649] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > > [ 22.601196] CR2: 00007fbcae5049b0 CR3: 00000000352fe002 CR4: 0000000000160ee0
> > > > [ 22.603539] Call Trace:
> > > > [ 22.605103] z3fold_zpool_shrink+0x25f/0x540
> > > > [ 22.607218] zswap_frontswap_store+0x424/0x7c1
> > > > [ 22.609115] __frontswap_store+0xc4/0x162
> > > > [ 22.610819] swap_writepage+0x39/0x70
> > > > [ 22.612525] pageout.isra.0+0x12c/0x5d0
> > > > [ 22.613957] shrink_page_list+0x1124/0x1830
> > > > [ 22.615130] shrink_inactive_list+0x1da/0x460
> > > > [ 22.616311] shrink_node_memcg+0x202/0x770
> > > > [ 22.617473] ? sched_clock_cpu+0xc/0xc0
> > > > [ 22.619145] shrink_node+0xdc/0x4a0
> > > > [ 22.620279] do_try_to_free_pages+0xdb/0x3c0
> > > > [ 22.621450] try_to_free_pages+0x112/0x2e0
> > > > [ 22.622582] __alloc_pages_slowpath+0x422/0x1000
> > > > [ 22.623749] ? __lock_acquire+0x247/0x1900
> > > > [ 22.624876] __alloc_pages_nodemask+0x37f/0x400
> > > > [ 22.626007] alloc_pages_vma+0x79/0x1e0
> > > > [ 22.627040] __read_swap_cache_async+0x1ec/0x3e0
> > > > [ 22.628143] swap_cluster_readahead+0x184/0x330
> > > > [ 22.629234] ? find_held_lock+0x32/0x90
> > > > [ 22.630292] swapin_readahead+0x2b4/0x4e0
> > > > [ 22.631370] ? sched_clock_cpu+0xc/0xc0
> > > > [ 22.632379] do_swap_page+0x3ac/0xc30
> > > > [ 22.633356] __handle_mm_fault+0x8dd/0x1900
> > > > [ 22.634373] handle_mm_fault+0x159/0x340
> > > > [ 22.635714] do_user_addr_fault+0x1fe/0x480
> > > > [ 22.636738] do_page_fault+0x31/0x210
> > > > [ 22.637674] page_fault+0x3e/0x50
> > > > [ 22.638559] RIP: 0033:0x562b503bd298
> > > > [ 22.639476] Code: 7e 01 00 00 89 df e8 47 e1 ff ff 44 8b 2d 84 4d
> > > > 00 00 4d 85 ff 7e 40 31 c0 eb 0f 0f 1f 80 00 00 00 00 4c 01 f0 49 39
> > > > c7 7e 2d <80> 7c 05 00 5a 4c 8d 54 05 00 74 ec 4c 89 14 24 45 85 ed 0f
> > > > 89 de
> > > > [ 22.642658] RSP: 002b:00007ffd83e31e80 EFLAGS: 00010206
> > > > [ 22.643900] RAX: 0000000000f09000 RBX: ffffffffffffffff RCX: 00007f48317b0156
> > > > [ 22.645242] RDX: 0000000000000000 RSI: 000000000b276000 RDI: 0000000000000000
> > > > [ 22.646571] RBP: 00007f4826441010 R08: 00007f4826441010 R09: 0000000000000000
> > > > [ 22.647888] R10: 00007f4827349010 R11: 0000000000000246 R12: 0000562b503bf004
> > > > [ 22.649210] R13: 0000000000000002 R14: 0000000000001000 R15: 000000000b275800
> > > > [ 22.650518] Modules linked in: ip6t_rpfilter ip6t_REJECT
> > > > nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ip6table_nat
> > > > ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat
> > > > iptable_mangle iptable_raw iptable_security nf_conntrack
> > > > nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c ip_set nfnetlink
> > > > ip6table_filter ip6_tables iptable_filter ip_tables crct10dif_pclmul
> > > > crc32_pclmul ghash_clmulni_intel virtio_balloon virtio_net
> > > > net_failover intel_agp intel_gtt failover qxl drm_kms_helper
> > > > syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm crc32c_intel
> > > > serio_raw virtio_console virtio_blk agpgart qemu_fw_cfg
> > > > [ 22.659276] ---[ end trace cfa4596e38137688 ]---
> > > > [ 22.660398] RIP: 0010:__list_add_valid+0x3/0x40
> > > > [ 22.661493] Code: f4 ff ff ff e9 3a ff ff ff 49 c7 07 00 00 00 00
> > > > 41 c7 47 08 00 00 00 00 e9 66 ff ff ff e8 15 f6 b6 ff 90 90 90 90 90
> > > > 49 89 d0 <48> 8b 52 08 48 39 f2 0f 85 7c 00 00 00 4c 8b 0a 4d 39 c1 0f
> > > > 85 98
> > > > [ 22.664800] RSP: 0000:ffffa073802cfdf8 EFLAGS: 00010206
> > > > [ 22.666779] RAX: 00000000000003c0 RBX: ffff8d69ad052000 RCX: 8888888888888889
> > > > [ 22.669830] RDX: 0000000000000000 RSI: ffffc0737f6012e8 RDI: ffff8d69ad052000
> > > > [ 22.672878] RBP: ffffc0737f6012e8 R08: 0000000000000000 R09: 0000000000000001
> > > > [ 22.675920] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
> > > > [ 22.678966] R13: ffff8d69bd0dfc00 R14: ffff8d69bd0dfc08 R15: ffff8d69ad052010
> > > > [ 22.682014] FS: 00007f48316b7740(0000) GS:ffff8d69be800000(0000)
> > > > knlGS:0000000000000000
> > > > [ 22.685399] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > > [ 22.687991] CR2: 00007fbcae5049b0 CR3: 00000000352fe002 CR4: 0000000000160ee0
> > > > [ 22.691068] ------------[ cut here ]------------
> > > >
> > > > (gdb) l *__list_add_valid+0x3
> > > > 0xffffffff81551b43 is in __list_add_valid
> > > > (/srv/s_maage/pkg/linux/linux/lib/list_debug.c:23).
> > > > 18 */
> > > > 19
> > > > 20 bool __list_add_valid(struct list_head *new, struct list_head *prev,
> > > > 21 struct list_head *next)
> > > > 22 {
> > > > 23 if (CHECK_DATA_CORRUPTION(next->prev != prev,
> > > > 24 "list_add corruption. next->prev should be prev (%px), but was %px.
> > > > (next=%px).\n",
> > > > 25 prev, next->prev, next) ||
> > > > 26 CHECK_DATA_CORRUPTION(prev->next != next,
> > > > 27 "list_add corruption. prev->next should be next (%px), but was %px.
> > > > (prev=%px).\n",
> > > >
> > > > (gdb) l *do_compact_page+0x31e
> > > > 0xffffffff813396fe is in do_compact_page
> > > > (/srv/s_maage/pkg/linux/linux/include/linux/list.h:60).
> > > > 55 */
> > > > 56 static inline void __list_add(struct list_head *new,
> > > > 57 struct list_head *prev,
> > > > 58 struct list_head *next)
> > > > 59 {
> > > > 60 if (!__list_add_valid(new, prev, next))
> > > > 61 return;
> > > > 62
> > > > 63 next->prev = new;
> > > > 64 new->next = next;
> > > >
> > > > (gdb) l *z3fold_zpool_shrink+0x25f
> > > > 0xffffffff8133854f is in z3fold_zpool_shrink
> > > > (/srv/s_maage/pkg/linux/linux/arch/x86/include/asm/atomic64_64.h:102).
> > > > 97 *
> > > > 98 * Atomically decrements @v by 1.
> > > > 99 */
> > > > 100 static __always_inline void arch_atomic64_dec(atomic64_t *v)
> > > > 101 {
> > > > 102 asm volatile(LOCK_PREFIX "decq %0"
> > > > 103 : "=m" (v->counter)
> > > > 104 : "m" (v->counter) : "memory");
> > > > 105 }
> > > > 106 #define arch_atomic64_dec arch_atomic64_dec
> > > >
> > > > (gdb) l *zswap_frontswap_store+0x424
> > > > 0xffffffff812e8c74 is in zswap_frontswap_store
> > > > (/srv/s_maage/pkg/linux/linux/mm/zswap.c:955).
> > > > 950
> > > > 951 pool = zswap_pool_last_get();
> > > > 952 if (!pool)
> > > > 953 return -ENOENT;
> > > > 954
> > > > 955 ret = zpool_shrink(pool->zpool, 1, NULL);
> > > > 956
> > > > 957 zswap_pool_put(pool);
> > > > 958
> > > > 959 return ret;
> > > >
> > > >
> > > >
> > > > [7.] A small shell script or example program which triggers the
> > > > problem (if possible)
> > > >
> > > > for tmout in 10 10 10 20 20 20 30 120 $((3600/2)) 10; do
> > > >     stress --vm $(($(nproc)+2)) \
> > > >         --vm-bytes $(($(awk '/MemAvail/{print $2}' /proc/meminfo)*1024/$(nproc))) \
> > > >         --timeout "$tmout"
> > > > done
> > > >
> > > >
> > > > [8.] Environment
> > > >
> > > > My test machine is a Fedora 30 (minimal install) virtual machine with
> > > > 4 vCPUs, 1 GiB RAM and 2 GiB swap. Originally I noticed the problem
> > > > on other machines (Fedora 30). I guess any amount of memory pressure
> > > > combined with zswap activation can cause problems.
> > > >
> > > > The test machine only has what the install provides and what is
> > > > enabled by default. In addition, I've enabled the serial console
> > > > ("console=tty0 console=ttyS0"), enabled passwordless sudo to help
> > > > testing, and installed "stress".
> > > >
> > > > stress package version is stress-1.0.4-22.fc30
> > > >
> > > >
> > > > [8.1.] Software (add the output of the ver_linux script here)
> > > >
> > > > $ ./ver_linux
> > > > If some fields are empty or look unusual you may have an old version.
> > > > Compare to the current minimal requirements in Documentation/Changes.
> > > >
> > > > Linux localhost.localdomain 5.3.0-rc4 #69 SMP Fri Aug 16 19:52:23 EEST
> > > > 2019 x86_64 x86_64 x86_64 GNU/Linux
> > > >
> > > > Util-linux 2.33.2
> > > > Mount 2.33.2
> > > > Module-init-tools 25
> > > > E2fsprogs 1.44.6
> > > > Linux C Library 2.29
> > > > Dynamic linker (ldd) 2.29
> > > > Linux C++ Library 6.0.26
> > > > Procps 3.3.15
> > > > Kbd 2.0.4
> > > > Console-tools 2.0.4
> > > > Sh-utils 8.31
> > > > Udev 241
> > > > Modules Loaded agpgart crc32c_intel crc32_pclmul crct10dif_pclmul
> > > > drm drm_kms_helper failover fb_sys_fops ghash_clmulni_intel intel_agp
> > > > intel_gtt ip6table_filter ip6table_mangle ip6table_nat ip6table_raw
> > > > ip6_tables ip6table_security ip6t_REJECT ip6t_rpfilter ip_set
> > > > iptable_filter iptable_mangle iptable_nat iptable_raw ip_tables
> > > > iptable_security ipt_REJECT libcrc32c net_failover nf_conntrack
> > > > nf_defrag_ipv4 nf_defrag_ipv6 nf_nat nfnetlink nf_reject_ipv4
> > > > nf_reject_ipv6 qemu_fw_cfg qxl serio_raw syscopyarea sysfillrect
> > > > sysimgblt ttm virtio_balloon virtio_blk virtio_console virtio_net
> > > > xt_conntrack
> > > >
> > > >
> > > > [8.2.] Processor information (from /proc/cpuinfo):
> > > >
> > > > $ cat /proc/cpuinfo
> > > > processor : 0
> > > > vendor_id : GenuineIntel
> > > > cpu family : 6
> > > > model : 60
> > > > model name : Intel Core Processor (Haswell, no TSX, IBRS)
> > > > stepping : 1
> > > > microcode : 0x1
> > > > cpu MHz : 3198.099
> > > > cache size : 16384 KB
> > > > physical id : 0
> > > > siblings : 1
> > > > core id : 0
> > > > cpu cores : 1
> > > > apicid : 0
> > > > initial apicid : 0
> > > > fpu : yes
> > > > fpu_exception : yes
> > > > cpuid level : 13
> > > > wp : yes
> > > > flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
> > > > pat pse36 clflush mmx fxsr sse sse2 ss syscall nx pdpe1gb rdtscp lm
> > > > constant_tsc rep_good nopl xtopology cpuid pni pclmulqdq vmx ssse3 fma
> > > > cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes
> > > > xsave avx f16c rdrand hypervisor lahf_lm abm cpuid_fault
> > > > invpcid_single pti ssbd ibrs ibpb tpr_shadow vnmi flexpriority ept
> > > > vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid
> > > > xsaveopt arat umip md_clear
> > > > bugs : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs
> > > > bogomips : 6396.19
> > > > clflush size : 64
> > > > cache_alignment : 64
> > > > address sizes : 40 bits physical, 48 bits virtual
> > > > power management:
> > > >
> > > > processor : 1
> > > > vendor_id : GenuineIntel
> > > > cpu family : 6
> > > > model : 60
> > > > model name : Intel Core Processor (Haswell, no TSX, IBRS)
> > > > stepping : 1
> > > > microcode : 0x1
> > > > cpu MHz : 3198.099
> > > > cache size : 16384 KB
> > > > physical id : 1
> > > > siblings : 1
> > > > core id : 0
> > > > cpu cores : 1
> > > > apicid : 1
> > > > initial apicid : 1
> > > > fpu : yes
> > > > fpu_exception : yes
> > > > cpuid level : 13
> > > > wp : yes
> > > > flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
> > > > pat pse36 clflush mmx fxsr sse sse2 ss syscall nx pdpe1gb rdtscp lm
> > > > constant_tsc rep_good nopl xtopology cpuid pni pclmulqdq vmx ssse3 fma
> > > > cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes
> > > > xsave avx f16c rdrand hypervisor lahf_lm abm cpuid_fault
> > > > invpcid_single pti ssbd ibrs ibpb tpr_shadow vnmi flexpriority ept
> > > > vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid
> > > > xsaveopt arat umip md_clear
> > > > bugs : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs
> > > > bogomips : 6468.62
> > > > clflush size : 64
> > > > cache_alignment : 64
> > > > address sizes : 40 bits physical, 48 bits virtual
> > > > power management:
> > > >
> > > > processor : 2
> > > > vendor_id : GenuineIntel
> > > > cpu family : 6
> > > > model : 60
> > > > model name : Intel Core Processor (Haswell, no TSX, IBRS)
> > > > stepping : 1
> > > > microcode : 0x1
> > > > cpu MHz : 3198.099
> > > > cache size : 16384 KB
> > > > physical id : 2
> > > > siblings : 1
> > > > core id : 0
> > > > cpu cores : 1
> > > > apicid : 2
> > > > initial apicid : 2
> > > > fpu : yes
> > > > fpu_exception : yes
> > > > cpuid level : 13
> > > > wp : yes
> > > > flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
> > > > pat pse36 clflush mmx fxsr sse sse2 ss syscall nx pdpe1gb rdtscp lm
> > > > constant_tsc rep_good nopl xtopology cpuid pni pclmulqdq vmx ssse3 fma
> > > > cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes
> > > > xsave avx f16c rdrand hypervisor lahf_lm abm cpuid_fault
> > > > invpcid_single pti ssbd ibrs ibpb tpr_shadow vnmi flexpriority ept
> > > > vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid
> > > > xsaveopt arat umip md_clear
> > > > bugs : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs
> > > > bogomips : 6627.92
> > > > clflush size : 64
> > > > cache_alignment : 64
> > > > address sizes : 40 bits physical, 48 bits virtual
> > > > power management:
> > > >
> > > > processor : 3
> > > > vendor_id : GenuineIntel
> > > > cpu family : 6
> > > > model : 60
> > > > model name : Intel Core Processor (Haswell, no TSX, IBRS)
> > > > stepping : 1
> > > > microcode : 0x1
> > > > cpu MHz : 3198.099
> > > > cache size : 16384 KB
> > > > physical id : 3
> > > > siblings : 1
> > > > core id : 0
> > > > cpu cores : 1
> > > > apicid : 3
> > > > initial apicid : 3
> > > > fpu : yes
> > > > fpu_exception : yes
> > > > cpuid level : 13
> > > > wp : yes
> > > > flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
> > > > pat pse36 clflush mmx fxsr sse sse2 ss syscall nx pdpe1gb rdtscp lm
> > > > constant_tsc rep_good nopl xtopology cpuid pni pclmulqdq vmx ssse3 fma
> > > > cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes
> > > > xsave avx f16c rdrand hypervisor lahf_lm abm cpuid_fault
> > > > invpcid_single pti ssbd ibrs ibpb tpr_shadow vnmi flexpriority ept
> > > > vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid
> > > > xsaveopt arat umip md_clear
> > > > bugs : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs
> > > > bogomips : 6662.16
> > > > clflush size : 64
> > > > cache_alignment : 64
> > > > address sizes : 40 bits physical, 48 bits virtual
> > > > power management:
> > > >
> > > >
> > > > [8.3.] Module information (from /proc/modules):
> > > >
> > > > $ cat /proc/modules
> > > > ip6t_rpfilter 16384 1 - Live 0x0000000000000000
> > > > ip6t_REJECT 16384 2 - Live 0x0000000000000000
> > > > nf_reject_ipv6 20480 1 ip6t_REJECT, Live 0x0000000000000000
> > > > ipt_REJECT 16384 2 - Live 0x0000000000000000
> > > > nf_reject_ipv4 16384 1 ipt_REJECT, Live 0x0000000000000000
> > > > xt_conntrack 16384 13 - Live 0x0000000000000000
> > > > ip6table_nat 16384 1 - Live 0x0000000000000000
> > > > ip6table_mangle 16384 1 - Live 0x0000000000000000
> > > > ip6table_raw 16384 1 - Live 0x0000000000000000
> > > > ip6table_security 16384 1 - Live 0x0000000000000000
> > > > iptable_nat 16384 1 - Live 0x0000000000000000
> > > > nf_nat 126976 2 ip6table_nat,iptable_nat, Live 0x0000000000000000
> > > > iptable_mangle 16384 1 - Live 0x0000000000000000
> > > > iptable_raw 16384 1 - Live 0x0000000000000000
> > > > iptable_security 16384 1 - Live 0x0000000000000000
> > > > nf_conntrack 241664 2 xt_conntrack,nf_nat, Live 0x0000000000000000
> > > > nf_defrag_ipv6 24576 1 nf_conntrack, Live 0x0000000000000000
> > > > nf_defrag_ipv4 16384 1 nf_conntrack, Live 0x0000000000000000
> > > > libcrc32c 16384 2 nf_nat,nf_conntrack, Live 0x0000000000000000
> > > > ip_set 69632 0 - Live 0x0000000000000000
> > > > nfnetlink 20480 1 ip_set, Live 0x0000000000000000
> > > > ip6table_filter 16384 1 - Live 0x0000000000000000
> > > > ip6_tables 36864 7
> > > > ip6table_nat,ip6table_mangle,ip6table_raw,ip6table_security,ip6table_filter,
> > > > Live 0x0000000000000000
> > > > iptable_filter 16384 1 - Live 0x0000000000000000
> > > > ip_tables 32768 5
> > > > iptable_nat,iptable_mangle,iptable_raw,iptable_security,iptable_filter,
> > > > Live 0x0000000000000000
> > > > crct10dif_pclmul 16384 1 - Live 0x0000000000000000
> > > > crc32_pclmul 16384 0 - Live 0x0000000000000000
> > > > ghash_clmulni_intel 16384 0 - Live 0x0000000000000000
> > > > virtio_net 61440 0 - Live 0x0000000000000000
> > > > virtio_balloon 24576 0 - Live 0x0000000000000000
> > > > net_failover 24576 1 virtio_net, Live 0x0000000000000000
> > > > failover 16384 1 net_failover, Live 0x0000000000000000
> > > > intel_agp 24576 0 - Live 0x0000000000000000
> > > > intel_gtt 24576 1 intel_agp, Live 0x0000000000000000
> > > > qxl 77824 0 - Live 0x0000000000000000
> > > > drm_kms_helper 221184 3 qxl, Live 0x0000000000000000
> > > > syscopyarea 16384 1 drm_kms_helper, Live 0x0000000000000000
> > > > sysfillrect 16384 1 drm_kms_helper, Live 0x0000000000000000
> > > > sysimgblt 16384 1 drm_kms_helper, Live 0x0000000000000000
> > > > fb_sys_fops 16384 1 drm_kms_helper, Live 0x0000000000000000
> > > > ttm 126976 1 qxl, Live 0x0000000000000000
> > > > drm 602112 4 qxl,drm_kms_helper,ttm, Live 0x0000000000000000
> > > > crc32c_intel 24576 5 - Live 0x0000000000000000
> > > > serio_raw 20480 0 - Live 0x0000000000000000
> > > > virtio_blk 20480 3 - Live 0x0000000000000000
> > > > virtio_console 45056 0 - Live 0x0000000000000000
> > > > qemu_fw_cfg 20480 0 - Live 0x0000000000000000
> > > > agpgart 53248 4 intel_agp,intel_gtt,ttm,drm, Live 0x0000000000000000
> > > >
> > > >
> > > > [8.4.] Loaded driver and hardware information (/proc/ioports, /proc/iomem)
> > > >
> > > > $ cat /proc/ioports
> > > > 0000-0000 : PCI Bus 0000:00
> > > > 0000-0000 : dma1
> > > > 0000-0000 : pic1
> > > > 0000-0000 : timer0
> > > > 0000-0000 : timer1
> > > > 0000-0000 : keyboard
> > > > 0000-0000 : keyboard
> > > > 0000-0000 : rtc0
> > > > 0000-0000 : dma page reg
> > > > 0000-0000 : pic2
> > > > 0000-0000 : dma2
> > > > 0000-0000 : fpu
> > > > 0000-0000 : vga+
> > > > 0000-0000 : serial
> > > > 0000-0000 : QEMU0002:00
> > > > 0000-0000 : fw_cfg_io
> > > > 0000-0000 : 0000:00:1f.0
> > > > 0000-0000 : ACPI PM1a_EVT_BLK
> > > > 0000-0000 : ACPI PM1a_CNT_BLK
> > > > 0000-0000 : ACPI PM_TMR
> > > > 0000-0000 : ACPI GPE0_BLK
> > > > 0000-0000 : 0000:00:1f.3
> > > > 0000-0000 : PCI conf1
> > > > 0000-0000 : PCI Bus 0000:00
> > > > 0000-0000 : PCI Bus 0000:01
> > > > 0000-0000 : PCI Bus 0000:02
> > > > 0000-0000 : PCI Bus 0000:03
> > > > 0000-0000 : PCI Bus 0000:04
> > > > 0000-0000 : PCI Bus 0000:05
> > > > 0000-0000 : PCI Bus 0000:06
> > > > 0000-0000 : PCI Bus 0000:07
> > > > 0000-0000 : 0000:00:01.0
> > > > 0000-0000 : 0000:00:1f.2
> > > > 0000-0000 : ahci
> > > >
> > > > $ cat /proc/iomem
> > > > 00000000-00000000 : Reserved
> > > > 00000000-00000000 : System RAM
> > > > 00000000-00000000 : Reserved
> > > > 00000000-00000000 : PCI Bus 0000:00
> > > > 00000000-00000000 : Video ROM
> > > > 00000000-00000000 : Adapter ROM
> > > > 00000000-00000000 : Adapter ROM
> > > > 00000000-00000000 : Reserved
> > > > 00000000-00000000 : System ROM
> > > > 00000000-00000000 : System RAM
> > > > 00000000-00000000 : Kernel code
> > > > 00000000-00000000 : Kernel data
> > > > 00000000-00000000 : Kernel bss
> > > > 00000000-00000000 : Reserved
> > > > 00000000-00000000 : PCI MMCONFIG 0000 [bus 00-ff]
> > > > 00000000-00000000 : Reserved
> > > > 00000000-00000000 : PCI Bus 0000:00
> > > > 00000000-00000000 : 0000:00:01.0
> > > > 00000000-00000000 : 0000:00:01.0
> > > > 00000000-00000000 : PCI Bus 0000:07
> > > > 00000000-00000000 : PCI Bus 0000:06
> > > > 00000000-00000000 : PCI Bus 0000:05
> > > > 00000000-00000000 : PCI Bus 0000:04
> > > > 00000000-00000000 : 0000:04:00.0
> > > > 00000000-00000000 : PCI Bus 0000:03
> > > > 00000000-00000000 : 0000:03:00.0
> > > > 00000000-00000000 : PCI Bus 0000:02
> > > > 00000000-00000000 : 0000:02:00.0
> > > > 00000000-00000000 : xhci-hcd
> > > > 00000000-00000000 : PCI Bus 0000:01
> > > > 00000000-00000000 : 0000:01:00.0
> > > > 00000000-00000000 : 0000:01:00.0
> > > > 00000000-00000000 : 0000:00:1b.0
> > > > 00000000-00000000 : 0000:00:01.0
> > > > 00000000-00000000 : 0000:00:02.0
> > > > 00000000-00000000 : 0000:00:02.1
> > > > 00000000-00000000 : 0000:00:02.2
> > > > 00000000-00000000 : 0000:00:02.3
> > > > 00000000-00000000 : 0000:00:02.4
> > > > 00000000-00000000 : 0000:00:02.5
> > > > 00000000-00000000 : 0000:00:02.6
> > > > 00000000-00000000 : 0000:00:1f.2
> > > > 00000000-00000000 : ahci
> > > > 00000000-00000000 : PCI Bus 0000:07
> > > > 00000000-00000000 : PCI Bus 0000:06
> > > > 00000000-00000000 : 0000:06:00.0
> > > > 00000000-00000000 : virtio-pci-modern
> > > > 00000000-00000000 : PCI Bus 0000:05
> > > > 00000000-00000000 : 0000:05:00.0
> > > > 00000000-00000000 : virtio-pci-modern
> > > > 00000000-00000000 : PCI Bus 0000:04
> > > > 00000000-00000000 : 0000:04:00.0
> > > > 00000000-00000000 : virtio-pci-modern
> > > > 00000000-00000000 : PCI Bus 0000:03
> > > > 00000000-00000000 : 0000:03:00.0
> > > > 00000000-00000000 : virtio-pci-modern
> > > > 00000000-00000000 : PCI Bus 0000:02
> > > > 00000000-00000000 : PCI Bus 0000:01
> > > > 00000000-00000000 : 0000:01:00.0
> > > > 00000000-00000000 : virtio-pci-modern
> > > > 00000000-00000000 : IOAPIC 0
> > > > 00000000-00000000 : Reserved
> > > > 00000000-00000000 : Local APIC
> > > > 00000000-00000000 : Reserved
> > > > 00000000-00000000 : Reserved
> > > > 00000000-00000000 : PCI Bus 0000:00
> > > >
> > > >
> > > > [8.5.] PCI information ('lspci -vvv' as root)
> > > >
> > > > Attached as: lspci-vvv-5.3.0-rc4.txt
> > > >
> > > >
> > > > [8.6.] SCSI information (from /proc/scsi/scsi)
> > > >
> > > > $ cat /proc/scsi/scsi
> > > > Attached devices:
> > > > Host: scsi0 Channel: 00 Id: 00 Lun: 00
> > > > Vendor: QEMU Model: QEMU DVD-ROM Rev: 2.5+
> > > > Type: CD-ROM ANSI SCSI revision: 05
> > > >
> > > >
> > > > [8.7.] Other information that might be relevant to the problem
> > > >
> > > > During testing it looks like this:
> > > > $ egrep -r ^ /sys/module/zswap/parameters
> > > > /sys/module/zswap/parameters/same_filled_pages_enabled:Y
> > > > /sys/module/zswap/parameters/enabled:Y
> > > > /sys/module/zswap/parameters/max_pool_percent:20
> > > > /sys/module/zswap/parameters/compressor:lzo
> > > > /sys/module/zswap/parameters/zpool:z3fold
> > > >
> > > > $ cat /proc/meminfo
> > > > MemTotal: 983056 kB
> > > > MemFree: 377876 kB
> > > > MemAvailable: 660820 kB
> > > > Buffers: 14896 kB
> > > > Cached: 368028 kB
> > > > SwapCached: 0 kB
> > > > Active: 247500 kB
> > > > Inactive: 193120 kB
> > > > Active(anon): 58016 kB
> > > > Inactive(anon): 280 kB
> > > > Active(file): 189484 kB
> > > > Inactive(file): 192840 kB
> > > > Unevictable: 0 kB
> > > > Mlocked: 0 kB
> > > > SwapTotal: 4194300 kB
> > > > SwapFree: 4194300 kB
> > > > Dirty: 8 kB
> > > > Writeback: 0 kB
> > > > AnonPages: 57712 kB
> > > > Mapped: 81984 kB
> > > > Shmem: 596 kB
> > > > KReclaimable: 56272 kB
> > > > Slab: 128128 kB
> > > > SReclaimable: 56272 kB
> > > > SUnreclaim: 71856 kB
> > > > KernelStack: 2208 kB
> > > > PageTables: 1632 kB
> > > > NFS_Unstable: 0 kB
> > > > Bounce: 0 kB
> > > > WritebackTmp: 0 kB
> > > > CommitLimit: 4685828 kB
> > > > Committed_AS: 268512 kB
> > > > VmallocTotal: 34359738367 kB
> > > > VmallocUsed: 9764 kB
> > > > VmallocChunk: 0 kB
> > > > Percpu: 9312 kB
> > > > HardwareCorrupted: 0 kB
> > > > AnonHugePages: 0 kB
> > > > ShmemHugePages: 0 kB
> > > > ShmemPmdMapped: 0 kB
> > > > CmaTotal: 0 kB
> > > > CmaFree: 0 kB
> > > > HugePages_Total: 0
> > > > HugePages_Free: 0
> > > > HugePages_Rsvd: 0
> > > > HugePages_Surp: 0
> > > > Hugepagesize: 2048 kB
> > > > Hugetlb: 0 kB
> > > > DirectMap4k: 110452 kB
> > > > DirectMap2M: 937984 kB
> > > > DirectMap1G: 0 kB
> > > >
> > > >
> > > > [9.] Other notes
> > > >
> > > > My workaround is to disable zswap:
> > > >
> > > > sudo bash -c 'echo 0 > /sys/module/zswap/parameters/enabled'
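
A small check can confirm that the workaround took effect. This is only a sketch: the function name zswap_is_enabled is hypothetical, and it assumes the "enabled" parameter reads back as Y or N, as in the parameter listing in section 8.7.

```sh
# Succeed if zswap is enabled. The optional argument is the parameter
# directory; it defaults to the sysfs path used in the report.
zswap_is_enabled() {
    dir=${1:-/sys/module/zswap/parameters}
    [ "$(cat "$dir/enabled" 2>/dev/null)" = "Y" ]
}
```

For example: zswap_is_enabled && echo "zswap is still on"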
> > > >
> > > >
> > > > Sometimes stress dies simply because it runs out of memory, and some
> > > > other programs may die because of page allocation failures etc. But
> > > > that is not relevant here.
> > > >
> > > >
> > > > The actual stress command typically looks like this:
> > > >
> > > > stress --vm 6 --vm-bytes 228608000 --timeout 10
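
The numbers in that command line follow from the arithmetic in the reproducer in section 7: the worker count is the CPU count plus two, and the per-worker size is MemAvailable (in kB) times 1024, divided by the CPU count. A minimal sketch of that arithmetic follows. The function names are hypothetical, and the MemAvailable value of 893000 kB used in the comment is back-computed from 228608000 bytes on 4 CPUs. It is an assumption, not a figure from the report.

```sh
# Number of stress VM workers: CPU count plus two, as in the reproducer.
stress_vm_workers() {
    echo $(( $1 + 2 ))
}

# Bytes per worker: MemAvailable in kB ($1) * 1024 / CPU count ($2).
stress_vm_bytes() {
    echo $(( $1 * 1024 / $2 ))
}

# Example (not executed here): with 4 CPUs and an assumed MemAvailable
# of 893000 kB this yields "--vm 6 --vm-bytes 228608000".
#   stress --vm "$(stress_vm_workers 4)" \
#          --vm-bytes "$(stress_vm_bytes 893000 4)" --timeout 10
```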
> > > >
> > > >
> > > > It seems to be essential to start and stop stress runs. Sometimes the
> > > > problem does not trigger until much later. To be sure there are no
> > > > problems, I'd suggest running stress for at least an hour (--timeout
> > > > 3600) and also a couple of hundred times with a short timeout. I used
> > > > 90 minutes as the mark of a "good" run at the start of the bisect. I'm
> > > > not sure whether there is only one issue here.
> > > >
> > > > I reboot the machine with the kernel under test, run uname -r and
> > > > collect the boot logs over ssh, and then ssh in with the test script.
> > > > No other commands are run.
> > > >
> > > > Some timestamps of errors, to give an idea of how long to wait for the
> > > > test to give results. Testing starts when the machine has been up for
> > > > about 8 or 9 seconds.
> > > >
> > > > [ 13.805105] general protection fault: 0000 [#1] SMP PTI
> > > > [ 14.059768] general protection fault: 0000 [#1] SMP PTI
> > > > [ 14.324867] general protection fault: 0000 [#1] SMP PTI
> > > > [ 14.458709] general protection fault: 0000 [#1] SMP PTI
> > > > [ 41.818966] BUG: unable to handle page fault for address: fffff54cf8000028
> > > > [ 105.710330] BUG: unable to handle page fault for address: ffffd2df8a000028
> > > > [ 135.390332] BUG: unable to handle page fault for address: ffffe5a34a000028
> > > > [ 166.793041] BUG: unable to handle page fault for address: ffffd1be6f000028
> > > > [ 311.602285] BUG: unable to handle page fault for address: fffff7f409000028
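
When repeating such runs, the first failure can be picked out of the kernel log mechanically. The sketch below matches only the three message types quoted in this report ("general protection fault", "BUG: unable to handle page fault", "kernel BUG at"). The function name first_failure is hypothetical, and other failure modes would need more patterns.

```sh
# Print the first line on stdin that matches one of the failure
# signatures quoted in the report; return non-zero if none matches.
first_failure() {
    line=$(grep -E 'general protection fault|BUG: unable to handle page fault|kernel BUG at' | head -n 1)
    [ -n "$line" ] && printf '%s\n' "$line"
}
```

For example: dmesg | first_failure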
> > >
> > > > 00:00.0 Host bridge: Intel Corporation 82G33/G31/P35/P31 Express DRAM Controller
> > > > Subsystem: Red Hat, Inc. QEMU Virtual Machine
> > > > Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
> > > > Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
> > > > Kernel modules: intel_agp
> > > >
> > > > 00:01.0 VGA compatible controller: Red Hat, Inc. QXL paravirtual graphic card (rev 04) (prog-if 00 [VGA controller])
> > > > Subsystem: Red Hat, Inc. QEMU Virtual Machine
> > > > Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
> > > > Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
> > > > Interrupt: pin A routed to IRQ 21
> > > > Region 0: Memory at f4000000 (32-bit, non-prefetchable) [size=64M]
> > > > Region 1: Memory at f8000000 (32-bit, non-prefetchable) [size=64M]
> > > > Region 2: Memory at fce14000 (32-bit, non-prefetchable) [size=8K]
> > > > Region 3: I/O ports at c040 [size=32]
> > > > Expansion ROM at 000c0000 [disabled] [size=128K]
> > > > Kernel driver in use: qxl
> > > > Kernel modules: qxl
> > > >
> > > > 00:02.0 PCI bridge: Red Hat, Inc. QEMU PCIe Root port (prog-if 00 [Normal decode])
> > > > Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
> > > > Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
> > > > Latency: 0
> > > > Interrupt: pin A routed to IRQ 22
> > > > Region 0: Memory at fce16000 (32-bit, non-prefetchable) [size=4K]
> > > > Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
> > > > I/O behind bridge: 00001000-00001fff [size=4K]
> > > > Memory behind bridge: fcc00000-fcdfffff [size=2M]
> > > > Prefetchable memory behind bridge: 00000000fea00000-00000000febfffff [size=2M]
> > > > Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- <SERR- <PERR-
> > > > BridgeCtl: Parity- SERR+ NoISA- VGA- VGA16- MAbort- >Reset- FastB2B-
> > > > PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
> > > > Capabilities: [54] Express (v2) Root Port (Slot+), MSI 00
> > > > DevCap: MaxPayload 128 bytes, PhantFunc 0
> > > > ExtTag- RBE+
> > > > DevCtl: CorrErr+ NonFatalErr+ FatalErr+ UnsupReq+
> > > > RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
> > > > MaxPayload 128 bytes, MaxReadReq 128 bytes
> > > > DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend-
> > > > LnkCap: Port #16, Speed 2.5GT/s, Width x1, ASPM L0s, Exit Latency L0s <64ns
> > > > ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
> > > > LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk-
> > > > ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
> > > > LnkSta: Speed 2.5GT/s (ok), Width x1 (ok)
> > > > TrErr- Train- SlotClk- DLActive+ BWMgmt- ABWMgmt-
> > > > SltCap: AttnBtn+ PwrCtrl+ MRL- AttnInd+ PwrInd+ HotPlug+ Surprise+
> > > > Slot #0, PowerLimit 0.000W; Interlock+ NoCompl-
> > > > SltCtl: Enable: AttnBtn+ PwrFlt- MRL- PresDet- CmdCplt+ HPIrq+ LinkChg-
> > > > Control: AttnInd Off, PwrInd On, Power- Interlock-
> > > > SltSta: Status: AttnBtn- PowerFlt- MRL- CmdCplt- PresDet+ Interlock-
> > > > Changed: MRL- PresDet- LinkState-
> > > > RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna- CRSVisible-
> > > > RootCap: CRSVisible-
> > > > RootSta: PME ReqID 0000, PMEStatus- PMEPending-
> > > > DevCap2: Completion Timeout: Not Supported, TimeoutDis-, LTR-, OBFF Not Supported ARIFwd+
> > > > AtomicOpsCap: Routing- 32bit- 64bit- 128bitCAS-
> > > > DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled ARIFwd-
> > > > AtomicOpsCtl: ReqEn- EgressBlck-
> > > > LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-
> > > > Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
> > > > Compliance De-emphasis: -6dB
> > > > LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
> > > > EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
> > > > Capabilities: [48] MSI-X: Enable+ Count=1 Masked-
> > > > Vector table: BAR=0 offset=00000000
> > > > PBA: BAR=0 offset=00000800
> > > > Capabilities: [40] Subsystem: Red Hat, Inc. Device 0000
> > > > Capabilities: [100 v2] Advanced Error Reporting
> > > > UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
> > > > UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
> > > > UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
> > > > CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr-
> > > > CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
> > > > AERCap: First Error Pointer: 00, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn-
> > > > MultHdrRecCap+ MultHdrRecEn- TLPPfxPres- HdrLogCap-
> > > > HeaderLog: 00000000 00000000 00000000 00000000
> > > > RootCmd: CERptEn+ NFERptEn+ FERptEn+
> > > > RootSta: CERcvd- MultCERcvd- UERcvd- MultUERcvd-
> > > > FirstFatal- NonFatalMsg- FatalMsg- IntMsg 0
> > > > ErrorSrc: ERR_COR: 0000 ERR_FATAL/NONFATAL: 0000
> > > > Kernel driver in use: pcieport
> > > >
> > > > 00:02.1 PCI bridge: Red Hat, Inc. QEMU PCIe Root port (prog-if 00 [Normal decode])
> > > > Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
> > > > Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
> > > > La