Re: BUG in mm/zswap

On Tue, Apr 21, 2020 at 8:30 AM Vitaly Wool <vitaly.wool@xxxxxxxxxxxx> wrote:
On Tue, Apr 21, 2020, 5:19 PM Vlastimil Babka <vbabka@xxxxxxx> wrote:
On 4/20/20 1:15 PM, Raymond Jennings wrote:
> I got a bug check, and the guys in #kernelnewbies on OFTC told me to email you
> guys about it. I'm not sure what to do about it.

+CC zswap maintainers

Thanks Vlastimil, I might have a fix for this. I'm going to post a couple of patches this week and I'll make sure you are all CC'd.

~Vitaly

> 2036206:Apr 20 03:22:51 metalhead kernel: [103376.518888] kernel BUG at
> mm/zswap.c:1184!

Hmm that's this:

ret = crypto_comp_decompress(tfm, src, entry->length, dst, &dlen);
put_cpu_ptr(entry->pool->tfm);
kunmap_atomic(dst);
zpool_unmap_handle(entry->pool->zpool, entry->handle);
BUG_ON(ret);

Looks like decompression failed? Are there any messages prior to the BUG that
would indicate the failed decompression?

I don't know. My system went in the pooper after this happened, and I got some segfaults in userspace processes afterwards when I was shutting down for a reboot.

From loaded modules it seems like z3fold and lz4_decompress could be in use
here. What's the output of:
grep . /sys/module/zswap/parameters/*

I don't have this directly anymore, but I was enabling z3fold and I had some rather extreme memory usage going on. I had the pool size set at 90 percent of total memory, which was a whopping 32G of RAM. My guess is that there was an allocation failure. I think the compressor was lzo; at any rate, besides z3fold and the 90 percent setting, I left all other parameters at their genkernel defaults.

This was a one-time fluke, and my apologies for not grabbing the parameters when it first happened.
 
And is this reproducible? Or happened just once? Is it a regression after kernel
update?

I don't know. I'm running Gentoo with my distro's version of sys-kernel/gentoo-sources-5.6.5, IIRC.
 
> 2036207-Apr 20 03:22:51 metalhead kernel: [103376.518893] invalid opcode: 0000
> [#1] PREEMPT SMP PTI
> 2036208-Apr 20 03:22:51 metalhead kernel: [103376.518895] CPU: 5 PID: 2008 Comm:
> swapoff Not tainted 5.6.5-gentoo-x86_64 #1
> 2036209-Apr 20 03:22:51 metalhead kernel: [103376.518896] Hardware name: Dell
> Inc. OptiPlex 7020/02YYK5, BIOS A15 02/02/2018
> 2036210-Apr 20 03:22:51 metalhead kernel: [103376.518900] RIP:
> 0010:zswap_frontswap_load+0x238/0x250
> 2036211-Apr 20 03:22:51 metalhead kernel: [103376.518901] Code: 00 00 e8 bb 04
> e5 ff 65 8b 05 3c d3 dc 71 85 c0 0f 85 61 ff ff ff e8 3b 74 db ff e9 57 ff ff ff
> e8 31 74 db ff e9 35 ff ff ff <0f> 0b e8 25 74 db ff e9 00 ff ff ff e8 37 13 e2
> ff 0f 1f 80 00 00
> 2036212-Apr 20 03:22:51 metalhead kernel: [103376.518902] RSP:
> 0018:ffffa7ed41f6fb20 EFLAGS: 00010282
> 2036213-Apr 20 03:22:51 metalhead kernel: [103376.518903] RAX: 0000000080000000
> RBX: 00000000ffffffea RCX: 0000000000000000
> 2036214-Apr 20 03:22:51 metalhead kernel: [103376.518904] RDX: 0000000000000001
> RSI: 0000000000000000 RDI: 00000000ffffffff
> 2036215-Apr 20 03:22:51 metalhead kernel: [103376.518905] RBP: ffff8f37e9eab2a0
> R08: ffff8f3a308de780 R09: 0000000000000000
> 2036216-Apr 20 03:22:51 metalhead kernel: [103376.518905] R10: 0000000000000000
> R11: ffffa7ed41f6fb00 R12: ffff8f37bf4e4000
> 2036217-Apr 20 03:22:51 metalhead kernel: [103376.518906] R13: ffff8f3bf6908d28
> R14: ffff8f3bf6908d20 R15: ffff8f3bc7cc5ec8
> 2036218-Apr 20 03:22:51 metalhead kernel: [103376.518907] FS:
>  00007fc9c4cf0780(0000) GS:ffff8f3cfda00000(0000) knlGS:0000000000000000
> 2036219-Apr 20 03:22:51 metalhead kernel: [103376.518908] CS:  0010 DS: 0000 ES:
> 0000 CR0: 0000000080050033
> 2036220-Apr 20 03:22:51 metalhead kernel: [103376.518909] CR2: 00007f9e8a1d471c
> CR3: 0000000300754005 CR4: 00000000001606e0
> 2036221-Apr 20 03:22:51 metalhead kernel: [103376.518909] Call Trace:
> 2036222-Apr 20 03:22:51 metalhead kernel: [103376.518916]
>  __frontswap_load+0x9c/0xf0
> 2036223-Apr 20 03:22:51 metalhead kernel: [103376.518918]  swap_readpage+0xfb/0x330
> 2036224-Apr 20 03:22:51 metalhead kernel: [103376.518920]
>  swap_cluster_readahead+0x1da/0x300
> 2036225-Apr 20 03:22:51 metalhead kernel: [103376.518922]  ? 0xffffffff8e000000
> 2036226-Apr 20 03:22:51 metalhead kernel: [103376.518924]
>  swapin_readahead+0x2e4/0x4a0
> 2036227-Apr 20 03:22:51 metalhead kernel: [103376.518926]  ?
> put_swap_page+0x106/0x310
> 2036228-Apr 20 03:22:51 metalhead kernel: [103376.518928]
>  unuse_pte_range+0x167/0x760
> 2036229-Apr 20 03:22:51 metalhead kernel: [103376.518930]  try_to_unuse+0x5a1/0x730
> 2036230-Apr 20 03:22:51 metalhead kernel: [103376.518932]
>  __do_sys_swapoff+0x1df/0x6d0
> 2036231-Apr 20 03:22:51 metalhead kernel: [103376.518935]  ?
> exit_to_usermode_loop+0x97/0xf0
> 2036232-Apr 20 03:22:51 metalhead kernel: [103376.518937]  do_syscall_64+0x55/0x1b0
> 2036233-Apr 20 03:22:51 metalhead kernel: [103376.518942]
>  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> 2036234-Apr 20 03:22:51 metalhead kernel: [103376.518944] RIP: 0033:0x7fc9c4e25657
> 2036235-Apr 20 03:22:51 metalhead kernel: [103376.518945] Code: 73 01 c3 48 8b
> 0d 39 b8 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f
> 44 00 00 b8 a8 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 09 b8 0c 00
> f7 d8 64 89 01 48
> 2036236-Apr 20 03:22:51 metalhead kernel: [103376.518946] RSP:
> 002b:00007ffc715f01e8 EFLAGS: 00000206 ORIG_RAX: 00000000000000a8
> 2036237-Apr 20 03:22:51 metalhead kernel: [103376.518947] RAX: ffffffffffffffda
> RBX: 0000000000000000 RCX: 00007fc9c4e25657
> 2036238-Apr 20 03:22:51 metalhead kernel: [103376.518948] RDX: 0000000000000001
> RSI: 0000000000000003 RDI: 0000557e405007b0
> 2036239-Apr 20 03:22:51 metalhead kernel: [103376.518948] RBP: 00007ffc715f1442
> R08: 0000557e404fe580 R09: 0000000000000001
> 2036240-Apr 20 03:22:51 metalhead kernel: [103376.518949] R10: 00007fc9c50018e0
> R11: 0000000000000206 R12: 0000000000000000
> 2036241-Apr 20 03:22:51 metalhead kernel: [103376.518949] R13: 0000557e405007b0
> R14: 0000000000000000 R15: 0000000000000000
> 2036242-Apr 20 03:22:51 metalhead kernel: [103376.518951] Modules linked in:
> z3fold bfq ipt_REJECT nf_reject_ipv4 xt_multiport iptable_filter ip_tables
> af_packet snd_hda_codec_hdmi i915 i2c_algo_bit drm_kms_helper intel_rapl_msr
> intel_rapl_common cec uvcvideo x86_pkg_temp_thermal intel_powerclamp
> snd_hda_codec_generic drm dell_wmi ledtrig_audio videobuf2_vmalloc sparse_keymap
> iTCO_wdt kvm_intel wmi_bmof videobuf2_memops dell_smbios dell_wmi_descriptor
> iTCO_vendor_support snd_hda_intel snd_usb_audio drm_panel_orientation_quirks
> dcdbas snd_usbmidi_lib snd_rawmidi mousedev videobuf2_v4l2 kvm videobuf2_common
> videodev intel_gtt agpgart snd_seq_device irqbypass input_leds joydev
> syscopyarea snd_intel_dspcfg sysfillrect sysimgblt fb_sys_fops binfmt_misc
> crct10dif_pclmul i2c_i801 ghash_clmulni_intel snd_hda_codec i2c_core
> intel_cstate video wmi snd_hwdep intel_uncore snd_hda_core snd_pcm e1000e
> intel_rapl_perf snd_timer snd backlight evbug lpc_ich evdev pcspkr soundcore
> mfd_core coretemp hwmon aesni_intel crypto_simd cryptd glue
> 2036243-Apr 20 03:22:51 metalhead kernel: helper
> 2036244-Apr 20 03:22:51 metalhead kernel: [103376.518983]  algif_rng algif_aead
> algif_hash algif_skcipher af_alg crc32c_intel crc32_pclmul crc32_generic
> configfs overlay squashfs lz4_decompress loop btrfs xor ext4 mbcache jbd2
> raid6_pq libcrc32c dm_snapshot dm_mirror dm_region_hash dm_log_userspace dm_log
> dm_bufio dm_mod firewire_core crc_itu_t hid_generic usbhid ohci_hcd usb_storage
> hid xhci_plat_hcd xhci_pci xhci_hcd ehci_pci ehci_hcd usbcore usb_common
> scsi_transport_fc sr_mod cdrom sg sd_mod t10_pi ahci libahci libata scsi_mod
> 2036245-Apr 20 03:22:51 metalhead kernel: [103376.519004] ---[ end trace
> 5959740853c6dbd4 ]---

