* Bert Karwatzki <spasswolf@xxxxxx> [240703 06:14]:
> If been running the fixed patchset on top of linux-next for a few days now, so far
> without error but then I ran into this. After this the system went into a kernel panic
> (freeze, flashing capslock) this is the last message preserverd in /var/log/kern.log. I
> tried to emergency sync using magic sysrq, but that did not work so the actual panic message
> is lost. I do not know how (or if) this is related to the patch set.

Thanks. Is this the first error printed in the kern.log?

> The kernel used is linux-next-20240703 plus your v2 patchset plus one additional unrelated patch
> (just #ifdef CONFIG_OF in drivers/pci/bus.c related to
> https://lore.kernel.org/all/20240612082019.19161-4-brgl@xxxxxxxx/#t).

It could very likely be me, but I don't know how the count would have
errors so late in the process life cycle - do you have
CONFIG_DEBUG_VM_MAPLE_TREE enabled? This would check the count against
the tree on modifications and would narrow down where things could have
gone wrong.

>
>
> [ T8516] show_signal_msg: 16 callbacks suppressed
> [ T8516] Isolated Web Co[8516]: segfault at 0 ip 00007f8c1f55fbe5 sp 00007ffcc2b97660 error 6 in libxul.so[4f98be5,7f8c1a686000+5f96000] likely on CPU 14 (core 7, socket 0)

This is firefox again.

> [ T8516] Code: 48 8d 0d 63 a3 3c 01 48 89 08 c7 04 25 00 00 00 00 00 00 00 00 0f 0b 48 8b 05 47 1a e2 03 48 8d 0d 38 99 30 01 48 89 08 31 c0 <89> 04 25 00 00 00 00 0f 0b e8 7d 7a 12 fb 66 2e 0f 1f 84 00 00 00
> [ T8521] ------------[ cut here ]------------
> [ T8521] kernel BUG at mm/mmap.c:3521!

This is failing because the munmap count != mm->map_count on
exit_mmap().

> [ T8521] Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
> [ T8521] CPU: 6 UID: 1000 PID: 8521 Comm: Socket Thread Not tainted 6.10.0-rc6-next-20240703-00016-g09a756327684 #1416
> [ T8521] Hardware name: Micro-Star International Co., Ltd. Alpha 15 B5EEK/MS-158L, BIOS E158LAMS.107 11/10/2021
> [ T8521] RIP: 0010:exit_mmap+0x28c/0x2a0
> [ T8521] Code: f7 45 31 ed e8 f5 7e e9 ff 4c 89 f7 e8 3d f4 6c 00 e9 63 ff ff ff 48 89 ef e8 70 11 04 00 e9 d1 fd ff ff 0f 0b e9 66 ff ff ff <0f> 0b e8 8d 0a 6c 00 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 66 0f
> [ T8521] RSP: 0018:ffffac759002fca0 EFLAGS: 00010297
> [ T8521] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
> [ T8521] RDX: 0000000000000001 RSI: ffff8bb7caca9700 RDI: ffff8bb7caca9708
> [ T8521] RBP: ffff8bb7c237e900 R08: 0000000000000000 R09: 000000000000000f
> [ T8521] R10: 00007ffcc2b9bfff R11: 0000000000000078 R12: 00000000000006b5
> [ T8521] R13: fffffffffffeb575 R14: ffff8bb7c237e9a8 R15: ffff8bb7c237e940
> [ T8521] FS: 0000000000000000(0000) GS:ffff8bc6adf80000(0000) knlGS:0000000000000000
> [ T8521] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ T8521] CR2: 00007f2b6cae2000 CR3: 0000000fd1c18000 CR4: 0000000000750ef0
> [ T8521] PKRU: 55555554
> [ T8521] Call Trace:
> [ T8521] <TASK>
> [ T8521] ? die+0x31/0x80
> [ T8521] ? do_trap+0xf1/0x100
> [ T8521] ? exit_mmap+0x28c/0x2a0
> [ T8521] ? do_error_trap+0x60/0x80
> [ T8521] ? exit_mmap+0x28c/0x2a0
> [ T8521] ? exc_invalid_op+0x4d/0x70
> [ T8521] ? exit_mmap+0x28c/0x2a0
> [ T8521] ? asm_exc_invalid_op+0x1a/0x20
> [ T8521] ? exit_mmap+0x28c/0x2a0
> [ T8521] ? mmput+0x50/0x120
> [ T8521] ? do_exit+0x285/0x9c0
> [ T8521] ? do_group_exit+0x2b/0x80
> [ T8521] ? get_signal+0x731/0x7e0
> [ T8521] ? arch_do_signal_or_restart+0x29/0x230
> [ T8521] ? srso_alias_return_thunk+0x5/0xfbef5
> [ T8521] ? srso_alias_return_thunk+0x5/0xfbef5
> [ T8521] ? __x64_sys_futex+0x109/0x1c0
> [ T8521] ? syscall_exit_to_user_mode+0x154/0x1a0
> [ T8521] ? do_syscall_64+0x6b/0x170
> [ T8521] ? entry_SYSCALL_64_after_hwframe+0x55/0x5d
> [ T8521] </TASK>
> [ T8521] Modules linked in: ccm snd_seq_dummy snd_hrtimer snd_seq_midi snd_seq_midi_event snd_rawmidi snd_seq snd_seq_device rfcomm cmac bnep nls_ascii nls_cp437 vfat fat snd_ctl_led btusb snd_hda_codec_realtek btrtl btintel snd_hda_codec_generic btbcm btmtk snd_hda_scodec_component snd_hda_codec_hdmi bluetooth snd_hda_intel snd_intel_dspcfg uvcvideo snd_hda_codec videobuf2_vmalloc snd_acp3x_pdm_dma snd_soc_dmic snd_hwdep snd_acp3x_rn uvc amd_atl videobuf2_memops snd_soc_core snd_hda_core videobuf2_v4l2 snd_pcm_oss videodev snd_mixer_oss snd_pcm snd_rn_pci_acp3x videobuf2_common snd_acp_config snd_timer snd_soc_acpi msi_wmi ecdh_generic ecc sparse_keymap mc edac_mce_amd wmi_bmof snd ccp k10temp joydev soundcore snd_pci_acp3x battery ac button hid_sensor_gyro_3d hid_sensor_prox hid_sensor_als hid_sensor_magn_3d hid_sensor_accel_3d hid_sensor_trigger industrialio_triggered_buffer kfifo_buf industrialio amd_pmc hid_sensor_iio_common evdev hid_multitouch serio_raw mt7921e mt7921_common mt792x_lib mt76_connac_lib mt76
> [ T8521] mac80211 libarc4 cfg80211 rfkill msr fuse nvme_fabrics efi_pstore configfs efivarfs autofs4 ext4 crc32c_generic mbcache jbd2 usbhid amdgpu i2c_algo_bit drm_ttm_helper xhci_pci ttm drm_exec drm_suballoc_helper xhci_hcd amdxcp drm_buddy hid_sensor_hub usbcore nvme gpu_sched mfd_core hid_generic crc32c_intel psmouse i2c_piix4 amd_sfh drm_display_helper usb_common nvme_core crc16 r8169 i2c_hid_acpi i2c_hid hid i2c_designware_platform i2c_designware_core
> [ T8521] ---[ end trace 0000000000000000 ]---
> [ T8521] RIP: 0010:exit_mmap+0x28c/0x2a0
> [ T8521] Code: f7 45 31 ed e8 f5 7e e9 ff 4c 89 f7 e8 3d f4 6c 00 e9 63 ff ff ff 48 89 ef e8 70 11 04 00 e9 d1 fd ff ff 0f 0b e9 66 ff ff ff <0f> 0b e8 8d 0a 6c 00 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 66 0f
> [ T8521] RSP: 0018:ffffac759002fca0 EFLAGS: 00010297
> [ T8521] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
> [ T8521] RDX: 0000000000000001 RSI: ffff8bb7caca9700 RDI: ffff8bb7caca9708
> [ T8521] RBP: ffff8bb7c237e900 R08: 0000000000000000 R09: 000000000000000f
> [ T8521] R10: 00007ffcc2b9bfff R11: 0000000000000078 R12: 00000000000006b5
> [ T8521] R13: fffffffffffeb575 R14: ffff8bb7c237e9a8 R15: ffff8bb7c237e940
> [ T8521] FS: 0000000000000000(0000) GS:ffff8bc6adfc0000(0000) knlGS:0000000000000000
> [ T8521] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ T8521] CR2: 00007f460adce000 CR3: 0000000fd1c18000 CR4: 0000000000750ef0
> [ T8521] PKRU: 55555554
> [ T8521] Fixing recursive fault but reboot is needed!
> [ T8521] BUG: scheduling while atomic: Socket Thread/8521/0x00000000
> [ T8521] Modules linked in: ccm snd_seq_dummy snd_hrtimer snd_seq_midi snd_seq_midi_event snd_rawmidi snd_seq snd_seq_device rfcomm cmac bnep nls_ascii nls_cp437 vfat fat snd_ctl_led btusb snd_hda_codec_realtek btrtl btintel snd_hda_codec_generic btbcm btmtk snd_hda_scodec_component snd_hda_codec_hdmi bluetooth snd_hda_intel snd_intel_dspcfg uvcvideo snd_hda_codec videobuf2_vmalloc snd_acp3x_pdm_dma snd_soc_dmic snd_hwdep snd_acp3x_rn uvc amd_atl videobuf2_memops snd_soc_core snd_hda_core videobuf2_v4l2 snd_pcm_oss videodev snd_mixer_oss snd_pcm snd_rn_pci_acp3x videobuf2_common snd_acp_config snd_timer snd_soc_acpi msi_wmi ecdh_generic ecc sparse_keymap mc edac_mce_amd wmi_bmof snd ccp k10temp joydev soundcore snd_pci_acp3x battery ac button hid_sensor_gyro_3d hid_sensor_prox hid_sensor_als hid_sensor_magn_3d hid_sensor_accel_3d hid_sensor_trigger industrialio_triggered_buffer kfifo_buf industrialio amd_pmc hid_sensor_iio_common evdev hid_multitouch serio_raw mt7921e mt7921_common mt792x_lib mt76_connac_lib mt76
> [ T8521] mac80211 libarc4 cfg80211 rfkill msr fuse nvme_fabrics efi_pstore configfs efivarfs autofs4 ext4 crc32c_generic mbcache jbd2 usbhid amdgpu i2c_algo_bit drm_ttm_helper xhci_pci ttm drm_exec drm_suballoc_helper xhci_hcd amdxcp drm_buddy hid_sensor_hub usbcore nvme gpu_sched mfd_core hid_generic crc32c_intel psmouse i2c_piix4 amd_sfh drm_display_helper usb_common nvme_core crc16 r8169 i2c_hid_acpi i2c_hid hid i2c_designware_platform i2c_designware_core
> [ T8521] CPU: 7 UID: 1000 PID: 8521 Comm: Socket Thread Tainted: G D 6.10.0-rc6-next-20240703-00016-g09a756327684 #1416
> [ T8521] Tainted: [D]=DIE
> [ T8521] Hardware name: Micro-Star International Co., Ltd. Alpha 15 B5EEK/MS-158L, BIOS E158LAMS.107 11/10/2021
> [ T8521] Call Trace:
> [ T8521] <TASK>
> [ T8521] ? dump_stack_lvl+0x53/0x70
> [ T8521] ? __schedule_bug+0x4d/0x60
> [ T8521] ? __schedule+0x734/0x800
> [ T8521] ? srso_alias_return_thunk+0x5/0xfbef5
> [ T8521] ? _printk+0x57/0x80
> [ T8521] ? do_task_dead+0x3d/0x40
> [ T8521] ? make_task_dead+0x13b/0x160
> [ T8521] ? rewind_stack_and_make_dead+0x16/0x20
> [ T8521] </TASK>

It looks like it might have been in the process of killing the thread
group (terminating firefox ?) for another reason?

Considering that I modify the count under the mmap_lock in write mode
and that the map_count is verified in validate_mm(), I am eager to find
out if you are running with CONFIG_DEBUG_VM_MAPLE_TREE enabled.

Thanks,
Liam
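
For readers following along, here is a minimal userspace sketch of the invariant discussed above: teardown counts the mappings it removes and compares that against the cached map_count. This is a toy model, not the kernel code. The toy_mm, toy_vma, toy_insert and toy_exit_check names are hypothetical, and a plain linked list stands in for the maple tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a VMA; only the list linkage matters here. */
struct toy_vma {
	struct toy_vma *next;
};

/* Hypothetical stand-in for an mm: a list of mappings plus a cached count. */
struct toy_mm {
	struct toy_vma *vmas;	/* linked list instead of a maple tree */
	int map_count;		/* cached number of mappings */
};

/* Link a mapping and bump the cached count, as a correct insert path would. */
static void toy_insert(struct toy_mm *mm, struct toy_vma *vma)
{
	assert(mm != NULL && vma != NULL);
	vma->next = mm->vmas;
	mm->vmas = vma;
	mm->map_count++;
}

/*
 * Teardown: unlink every mapping while counting, then compare the walked
 * count with the cached one. Returns true when they agree. A mismatch is
 * the condition the BUG in the log above reports.
 */
static bool toy_exit_check(struct toy_mm *mm)
{
	int count = 0;

	while (mm->vmas != NULL) {
		mm->vmas = mm->vmas->next;
		count++;
	}
	return count == mm->map_count;
}
```

If some path changes the list without adjusting map_count (or the reverse), the mismatch only shows up at teardown in this model. A check on every modification, which is what I am hoping CONFIG_DEBUG_VM_MAPLE_TREE gives us, would point much closer to the offending operation.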