Re: [Bug 216646] having TRANSPARENT_HUGEPAGE enabled hangs some applications (supervisor read access in kernel mode)

I did some more testing on v6.1.12 and reproduced the issue. But I have a new bit of information: since the last time I saw this issue I've migrated most of my storage from XFS to BTRFS, and today I couldn't reproduce the issue until I switched the source volume in the test back to XFS. So it seems the bug is either in the way XFS talks to mm/folios or is just triggered by it.

Anyway, I attached a report from v6.1.12 (it seems to be happening in the same place).


On 2/24/23 13:21, Linux regression tracking (Thorsten Leemhuis) wrote:
On 16.12.22 06:23, Thorsten Leemhuis wrote:
Hi, this is your Linux kernel regression tracker. Top-posting for once,
to make this easily accessible to everyone.
/me again

Was some progress made to get this regression resolved? From here it
looks kinda stalled, that's why I'm asking -- but maybe I just missed
something.
Did anything happen to get this regression resolved? Doesn't look like
it, but maybe I missed some progress.

Willy, Mikhail confirmed off-list to me that the problem still exists.
He also tried your patch and reported back. Is there something else you need?

Side note: I lost this out of sight during the festive season and should
have asked this earlier, but better late than never. :-D

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

#regzbot poke

On 06.12.22 03:08, Mikhail Pletnev wrote:
On Mon, 5 Dec 2022 20:25:11 +0000
Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
Thanks!  I think this may be the problem ...

Hi Matthew, thanks for the swift response. I've applied your last patch and ran my stress test a couple of times. It's still consistently crashing (albeit, it seems, in a different place):

[ 1975.257126] ***BAD SIBLING*** index 912583 offset 4
[ 1975.257128] node ffff9fc817e01ff0 offset 51 parent ffff9fc5c7a31ff0 shift 0 count 64 values 48 array ffff9fc521173e80 list ffff9fc817e02008 ffff9fc817e02008 marks 0 0 0
[ 1975.257133] BUG: kernel NULL pointer dereference, address: 0000000000000036
[ 1975.257135] #PF: supervisor read access in kernel mode
[ 1975.257137] #PF: error_code(0x0000) - not-present page
[ 1975.257138] PGD 0 P4D 0
[ 1975.257139] Oops: 0000 [#1] PREEMPT SMP NOPTI
[ 1975.257141] CPU: 5 PID: 8303 Comm: deluge-gtk Not tainted 5.17.0-rc4_ap_test-00163-g793917d997df-dirty #6
[ 1975.257144] Hardware name: Micro-Star International Co., Ltd. MS-7C35/MEG X570 UNIFY (MS-7C35), BIOS A.C3 03/15/2022
[ 1975.257146] RIP: 0010:__filemap_get_folio (/home/reinhardt/dev-apps/kernel/linux/./arch/x86/include/asm/atomic.h:29 /home/reinhardt/dev-apps/kernel/linux/./include/linux/atomic/atomic-arch-fallback.h:1158 /home/reinhardt/dev-apps/kernel/linux/./include/linux/atomic/atomic-arch-fallback.h:1183 /home/reinhardt/dev-apps/kernel/linux/./include/linux/atomic/atomic-instrumented.h:608 /home/reinhardt/dev-apps/kernel/linux/./include/linux/page_ref.h:238 /home/reinhardt/dev-apps/kernel/linux/./include/linux/page_ref.h:247 /home/reinhardt/dev-apps/kernel/linux/./include/linux/page_ref.h:280 /home/reinhardt/dev-apps/kernel/linux/./include/linux/page_ref.h:313 /home/reinhardt/dev-apps/kernel/linux/mm/filemap.c:1899 /home/reinhardt/dev-apps/kernel/linux/mm/filemap.c:1951)
[ 1975.257152] Code: 10 e8 56 fd 67 00 48 89 c3 48 3d 02 04 00 00 74 e2 48 3d 06 04 00 00 74 da 48 85 c0 0f 84 3e 02 00 00 a8 01 0f 85 40 02 00 00 <8b> 40 34 85 c0 74 c2 8d 50 01 f0 0f b1 53 34 75 f2 48 8b 54 24 28
All code
========
    0:	10 e8                	adc    %ch,%al
    2:	56                   	push   %rsi
    3:	fd                   	std
    4:	67 00 48 89          	add    %cl,-0x77(%eax)
    8:	c3                   	ret
    9:	48 3d 02 04 00 00    	cmp    $0x402,%rax
    f:	74 e2                	je     0xfffffffffffffff3
   11:	48 3d 06 04 00 00    	cmp    $0x406,%rax
   17:	74 da                	je     0xfffffffffffffff3
   19:	48 85 c0             	test   %rax,%rax
   1c:	0f 84 3e 02 00 00    	je     0x260
   22:	a8 01                	test   $0x1,%al
   24:	0f 85 40 02 00 00    	jne    0x26a
   2a:*	8b 40 34             	mov    0x34(%rax),%eax		<-- trapping instruction
   2d:	85 c0                	test   %eax,%eax
   2f:	74 c2                	je     0xfffffffffffffff3
   31:	8d 50 01             	lea    0x1(%rax),%edx
   34:	f0 0f b1 53 34       	lock cmpxchg %edx,0x34(%rbx)
   39:	75 f2                	jne    0x2d
   3b:	48 8b 54 24 28       	mov    0x28(%rsp),%rdx

Code starting with the faulting instruction
===========================================
    0:	8b 40 34             	mov    0x34(%rax),%eax
    3:	85 c0                	test   %eax,%eax
    5:	74 c2                	je     0xffffffffffffffc9
    7:	8d 50 01             	lea    0x1(%rax),%edx
    a:	f0 0f b1 53 34       	lock cmpxchg %edx,0x34(%rbx)
    f:	75 f2                	jne    0x3
   11:	48 8b 54 24 28       	mov    0x28(%rsp),%rdx
[ 1975.257154] RSP: 0000:ffffc2d744c37cb0 EFLAGS: 00010246
[ 1975.257155] RAX: 0000000000000002 RBX: 0000000000000002 RCX: 0000000000000000
[ 1975.257156] RDX: 0000000000000000 RSI: ffffffffbb117459 RDI: 00000000ffffffff
[ 1975.257157] RBP: 0000000000000000 R08: 00000000ffffdfff R09: 00000000ffffdfff
[ 1975.257158] R10: ffffffffbb472dc0 R11: ffffffffbb472dc0 R12: 0000000000000000
[ 1975.257159] R13: ffff9fc521173e78 R14: 00000000000decc7 R15: fff000003fffffff
[ 1975.257160] FS:  00007fb2137fe6c0(0000) GS:ffff9fcb7eb40000(0000) knlGS:0000000000000000
[ 1975.257161] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1975.257162] CR2: 0000000000000036 CR3: 0000000164114000 CR4: 0000000000750ee0
[ 1975.257163] PKRU: 55555554
[ 1975.257163] Call Trace:
[ 1975.257164]  <TASK>
[ 1975.257166] ? page_add_file_rmap (/home/reinhardt/dev-apps/kernel/linux/./include/linux/page-flags.h:195 /home/reinhardt/dev-apps/kernel/linux/mm/internal.h:440 /home/reinhardt/dev-apps/kernel/linux/mm/rmap.c:1270)
[ 1975.257169] filemap_fault (/home/reinhardt/dev-apps/kernel/linux/./include/linux/pagemap.h:531 /home/reinhardt/dev-apps/kernel/linux/mm/filemap.c:3107)
[ 1975.257172] __do_fault (/home/reinhardt/dev-apps/kernel/linux/mm/memory.c:3852)
[ 1975.257174] __handle_mm_fault (/home/reinhardt/dev-apps/kernel/linux/mm/memory.c:4169 /home/reinhardt/dev-apps/kernel/linux/mm/memory.c:4297 /home/reinhardt/dev-apps/kernel/linux/mm/memory.c:4555 /home/reinhardt/dev-apps/kernel/linux/mm/memory.c:4690)
[ 1975.257176] handle_mm_fault (/home/reinhardt/dev-apps/kernel/linux/mm/memory.c:4788)
[ 1975.257178] do_user_addr_fault (/home/reinhardt/dev-apps/kernel/linux/./include/linux/sched/signal.h:404 /home/reinhardt/dev-apps/kernel/linux/arch/x86/mm/fault.c:1399)
[ 1975.257181] exc_page_fault (/home/reinhardt/dev-apps/kernel/linux/./arch/x86/include/asm/irqflags.h:40 /home/reinhardt/dev-apps/kernel/linux/./arch/x86/include/asm/irqflags.h:75 /home/reinhardt/dev-apps/kernel/linux/arch/x86/mm/fault.c:1492 /home/reinhardt/dev-apps/kernel/linux/arch/x86/mm/fault.c:1540)
[ 1975.257184] ? asm_exc_page_fault (/home/reinhardt/dev-apps/kernel/linux/./arch/x86/include/asm/idtentry.h:568)
[ 1975.257186] asm_exc_page_fault (/home/reinhardt/dev-apps/kernel/linux/./arch/x86/include/asm/idtentry.h:568)
[ 1975.257188] RIP: 0033:0x7fb265b88409
[ 1975.257189] Code: 66 66 2e 0f 1f 84 00 00 00 00 00 66 66 2e 0f 1f 84 00 00 00 00 00 66 66 2e 0f 1f 84 00 00 00 00 00 48 89 f8 48 83 fa 20 72 27 <c5> fe 6f 06 48 83 fa 40 0f 87 a9 00 00 00 c5 fe 6f 4c 16 e0 c5 fe
All code
========
    0:	66 66 2e 0f 1f 84 00 	data16 cs nopw 0x0(%rax,%rax,1)
    7:	00 00 00 00
    b:	66 66 2e 0f 1f 84 00 	data16 cs nopw 0x0(%rax,%rax,1)
   12:	00 00 00 00
   16:	66 66 2e 0f 1f 84 00 	data16 cs nopw 0x0(%rax,%rax,1)
   1d:	00 00 00 00
   21:	48 89 f8             	mov    %rdi,%rax
   24:	48 83 fa 20          	cmp    $0x20,%rdx
   28:	72 27                	jb     0x51
   2a:*	c5 fe 6f 06          	vmovdqu (%rsi),%ymm0		<-- trapping instruction
   2e:	48 83 fa 40          	cmp    $0x40,%rdx
   32:	0f 87 a9 00 00 00    	ja     0xe1
   38:	c5 fe 6f 4c 16 e0    	vmovdqu -0x20(%rsi,%rdx,1),%ymm1
   3e:	c5                   	.byte 0xc5
   3f:	fe                   	.byte 0xfe

Code starting with the faulting instruction
===========================================
    0:	c5 fe 6f 06          	vmovdqu (%rsi),%ymm0
    4:	48 83 fa 40          	cmp    $0x40,%rdx
    8:	0f 87 a9 00 00 00    	ja     0xb7
    e:	c5 fe 6f 4c 16 e0    	vmovdqu -0x20(%rsi,%rdx,1),%ymm1
   14:	c5                   	.byte 0xc5
   15:	fe                   	.byte 0xfe
[ 1975.257190] RSP: 002b:00007fb2137fd908 EFLAGS: 00010202
[ 1975.257191] RAX: 00007fb204012a80 RBX: 0000000000000000 RCX: 00007fb2137fda90
[ 1975.257192] RDX: 0000000000004000 RSI: 00007f9fddbb51c3 RDI: 00007fb204012a80
[ 1975.257193] RBP: 00007fb2137fd928 R08: 00000000638ea1ab R09: 0000000000000000
[ 1975.257193] R10: 0000000000000008 R11: 0000000000000246 R12: 00007fb204000bb0
[ 1975.257194] R13: 00007fb21809a5a0 R14: 00000000decc71c3 R15: 0000000000004000
[ 1975.257196]  </TASK>
[ 1975.257196] Modules linked in: overlay xt_addrtype amdgpu drm_ttm_helper ttm gpu_sched drm_kms_helper iwlmvm backlight syscopyarea mac80211 sysfillrect sysimgblt libarc4 fb_sys_fops iwlwifi cfg80211 i2c_piix4 k10temp fuse configfs efivarfs
[ 1975.257207] CR2: 0000000000000036
[ 1975.257208] ---[ end trace 0000000000000000 ]---
[ 1975.257209] RIP: 0010:__filemap_get_folio (/home/reinhardt/dev-apps/kernel/linux/./arch/x86/include/asm/atomic.h:29 /home/reinhardt/dev-apps/kernel/linux/./include/linux/atomic/atomic-arch-fallback.h:1158 /home/reinhardt/dev-apps/kernel/linux/./include/linux/atomic/atomic-arch-fallback.h:1183 /home/reinhardt/dev-apps/kernel/linux/./include/linux/atomic/atomic-instrumented.h:608 /home/reinhardt/dev-apps/kernel/linux/./include/linux/page_ref.h:238 /home/reinhardt/dev-apps/kernel/linux/./include/linux/page_ref.h:247 /home/reinhardt/dev-apps/kernel/linux/./include/linux/page_ref.h:280 /home/reinhardt/dev-apps/kernel/linux/./include/linux/page_ref.h:313 /home/reinhardt/dev-apps/kernel/linux/mm/filemap.c:1899 /home/reinhardt/dev-apps/kernel/linux/mm/filemap.c:1951)
[ 1975.257211] Code: 10 e8 56 fd 67 00 48 89 c3 48 3d 02 04 00 00 74 e2 48 3d 06 04 00 00 74 da 48 85 c0 0f 84 3e 02 00 00 a8 01 0f 85 40 02 00 00 <8b> 40 34 85 c0 74 c2 8d 50 01 f0 0f b1 53 34 75 f2 48 8b 54 24 28
All code
========
    0:	10 e8                	adc    %ch,%al
    2:	56                   	push   %rsi
    3:	fd                   	std
    4:	67 00 48 89          	add    %cl,-0x77(%eax)
    8:	c3                   	ret
    9:	48 3d 02 04 00 00    	cmp    $0x402,%rax
    f:	74 e2                	je     0xfffffffffffffff3
   11:	48 3d 06 04 00 00    	cmp    $0x406,%rax
   17:	74 da                	je     0xfffffffffffffff3
   19:	48 85 c0             	test   %rax,%rax
   1c:	0f 84 3e 02 00 00    	je     0x260
   22:	a8 01                	test   $0x1,%al
   24:	0f 85 40 02 00 00    	jne    0x26a
   2a:*	8b 40 34             	mov    0x34(%rax),%eax		<-- trapping instruction
   2d:	85 c0                	test   %eax,%eax
   2f:	74 c2                	je     0xfffffffffffffff3
   31:	8d 50 01             	lea    0x1(%rax),%edx
   34:	f0 0f b1 53 34       	lock cmpxchg %edx,0x34(%rbx)
   39:	75 f2                	jne    0x2d
   3b:	48 8b 54 24 28       	mov    0x28(%rsp),%rdx

Code starting with the faulting instruction
===========================================
    0:	8b 40 34             	mov    0x34(%rax),%eax
    3:	85 c0                	test   %eax,%eax
    5:	74 c2                	je     0xffffffffffffffc9
    7:	8d 50 01             	lea    0x1(%rax),%edx
    a:	f0 0f b1 53 34       	lock cmpxchg %edx,0x34(%rbx)
    f:	75 f2                	jne    0x3
   11:	48 8b 54 24 28       	mov    0x28(%rsp),%rdx
[ 1975.257212] RSP: 0000:ffffc2d744c37cb0 EFLAGS: 00010246
[ 1975.257213] RAX: 0000000000000002 RBX: 0000000000000002 RCX: 0000000000000000
[ 1975.257214] RDX: 0000000000000000 RSI: ffffffffbb117459 RDI: 00000000ffffffff
[ 1975.257215] RBP: 0000000000000000 R08: 00000000ffffdfff R09: 00000000ffffdfff
[ 1975.257215] R10: ffffffffbb472dc0 R11: ffffffffbb472dc0 R12: 0000000000000000
[ 1975.257216] R13: ffff9fc521173e78 R14: 00000000000decc7 R15: fff000003fffffff
[ 1975.257217] FS:  00007fb2137fe6c0(0000) GS:ffff9fcb7eb40000(0000) knlGS:0000000000000000
[ 1975.257218] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1975.257219] CR2: 0000000000000036 CR3: 0000000164114000 CR4: 0000000000750ee0
[ 1975.257220] PKRU: 55555554

(full dmesg and my local changeset in attachments for your reference)
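
As a sanity check on the report above: the trapping instruction is `mov 0x34(%rax),%eax`, so the faulting address in CR2 should be RAX + 0x34. The sketch below is plain Python, not anything from the kernel tree. The helper names (`parse_registers`, `fault_matches`) are made up for illustration, and the two register lines are copied from the report.

```python
import re

# Register lines copied from the 5.17.0-rc4 oops above.
OOPS_5_17 = """\
RAX: 0000000000000002 RBX: 0000000000000002 RCX: 0000000000000000
CR2: 0000000000000036 CR3: 0000000164114000 CR4: 0000000000750ee0
"""

# Displacement taken from the trapping instruction: mov 0x34(%rax),%eax
DISPLACEMENT = 0x34


def parse_registers(text):
    """Return {register name: integer value} for 'NAME: <16 hex digits>' pairs."""
    pairs = re.findall(r"\b([A-Z][A-Z0-9]{1,3}): ([0-9a-f]{16})\b", text)
    return {name: int(value, 16) for name, value in pairs}


def fault_matches(text, base="RAX", disp=DISPLACEMENT):
    """True if base register + displacement equals the faulting address (CR2)."""
    regs = parse_registers(text)
    return regs[base] + disp == regs["CR2"]


if __name__ == "__main__":
    regs = parse_registers(OOPS_5_17)
    print(hex(regs["RAX"] + DISPLACEMENT), hex(regs["CR2"]))
    print(fault_matches(OOPS_5_17))
```

If the check holds, the "NULL pointer dereference" is really a read through a tiny non-pointer value in RAX rather than through a literal NULL.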

#regzbot poke
[  862.914175] BUG: kernel NULL pointer dereference, address: 00000000000000b6
[  862.914181] #PF: supervisor read access in kernel mode
[  862.914182] #PF: error_code(0x0000) - not-present page
[  862.914184] PGD 0 P4D 0
[  862.914186] Oops: 0000 [#1] PREEMPT SMP NOPTI
[  862.914188] CPU: 6 PID: 8272 Comm: deluge-gtk Not tainted 6.1.12-gentoo_ap #3
[  862.914190] Hardware name: Micro-Star International Co., Ltd. MS-7C35/MEG X570 UNIFY (MS-7C35), BIOS A.C3 03/15/2022
[  862.914191] RIP: 0010:__filemap_get_folio+0xa7/0x370
[  862.914195] Code: 10 e8 6d 41 2b 01 48 89 c3 48 3d 02 04 00 00 74 e2 48 3d 06 04 00 00 74 da 48 85 c0 0f 84 56 02 00 00 a8 01 0f 85 58 02 00 00 <8b> 40 34 85 c0 74 c2 8d 50 01 f0 0f b1 53 34 75 f2 48 8b 54 24 28
[  862.914197] RSP: 0000:ffffbd6d877bfc88 EFLAGS: 00010246
[  862.914198] RAX: 0000000000000082 RBX: 0000000000000082 RCX: 0000000000000002
[  862.914199] RDX: 0000000000000028 RSI: ffff9512dabc0248 RDI: ffffbd6d877bfc98
[  862.914200] RBP: 0000000000000000 R08: 000000000053252f R09: 0000000000000000
[  862.914201] R10: ffffffffffffffc0 R11: ffff950ff56e630c R12: 0000000000000000
[  862.914202] R13: ffff9510a09646b0 R14: 000000000053252a R15: fff000003fffffff
[  862.914204] FS:  00007f5d637fe6c0(0000) GS:ffff9516beb80000(0000) knlGS:0000000000000000
[  862.914205] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  862.914206] CR2: 00000000000000b6 CR3: 0000000154bf4000 CR4: 0000000000750ee0
[  862.914207] PKRU: 55555554
[  862.914208] Call Trace:
[  862.914210]  <TASK>
[  862.914212]  ? _raw_spin_unlock+0x10/0x30
[  862.914215]  filemap_fault+0x60/0x900
[  862.914217]  __do_fault+0x30/0xb0
[  862.914220]  __handle_mm_fault+0xca3/0x16c0
[  862.914222]  handle_mm_fault+0xe9/0x2e0
[  862.914224]  do_user_addr_fault+0x1b7/0x650
[  862.914226]  exc_page_fault+0x60/0x130
[  862.914229]  asm_exc_page_fault+0x22/0x30
[  862.914231] RIP: 0033:0x7f5dcd376409
[  862.914232] Code: 66 66 2e 0f 1f 84 00 00 00 00 00 66 66 2e 0f 1f 84 00 00 00 00 00 66 66 2e 0f 1f 84 00 00 00 00 00 48 89 f8 48 83 fa 20 72 27 <c5> fe 6f 06 48 83 fa 40 0f 87 a9 00 00 00 c5 fe 6f 4c 16 e0 c5 fe
[  862.914233] RSP: 002b:00007f5d637fd908 EFLAGS: 00010202
[  862.914234] RAX: 00007f5d94029660 RBX: 0000000000000000 RCX: 00007f5d637fda90
[  862.914235] RDX: 0000000000004000 RSI: 00007f1bac52a66a RDI: 00007f5d94029660
[  862.914236] RBP: 00007f5d637fd928 R08: 0000000063f8fa01 R09: 0000000000000000
[  862.914237] R10: 0000000000000008 R11: 0000000000000246 R12: 00007f5d94004c20
[  862.914237] R13: 00007f5d9c00a0d0 R14: 000000053252a66a R15: 0000000000004000
[  862.914239]  </TASK>
[  862.914240] Modules linked in: overlay xt_addrtype iwlmvm mac80211 libarc4 i2c_piix4 iwlwifi tpm_crb cfg80211 tpm_tis tpm_tis_core tpm k10temp fuse configfs efivarfs
[  862.914247] CR2: 00000000000000b6
[  862.914248] ---[ end trace 0000000000000000 ]---
[  862.914249] RIP: 0010:__filemap_get_folio+0xa7/0x370
[  862.914251] Code: 10 e8 6d 41 2b 01 48 89 c3 48 3d 02 04 00 00 74 e2 48 3d 06 04 00 00 74 da 48 85 c0 0f 84 56 02 00 00 a8 01 0f 85 58 02 00 00 <8b> 40 34 85 c0 74 c2 8d 50 01 f0 0f b1 53 34 75 f2 48 8b 54 24 28
[  862.914252] RSP: 0000:ffffbd6d877bfc88 EFLAGS: 00010246
[  862.914253] RAX: 0000000000000082 RBX: 0000000000000082 RCX: 0000000000000002
[  862.914255] RDX: 0000000000000028 RSI: ffff9512dabc0248 RDI: ffffbd6d877bfc98
[  862.914256] RBP: 0000000000000000 R08: 000000000053252f R09: 0000000000000000
[  862.914257] R10: ffffffffffffffc0 R11: ffff950ff56e630c R12: 0000000000000000
[  862.914258] R13: ffff9510a09646b0 R14: 000000000053252a R15: fff000003fffffff
[  862.914260] FS:  00007f5d637fe6c0(0000) GS:ffff9516beb80000(0000) knlGS:0000000000000000
[  862.914262] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  862.914262] CR2: 00000000000000b6 CR3: 0000000154bf4000 CR4: 0000000000750ee0
[  862.914263] PKRU: 55555554
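
One possible reading of the RAX values in the two reports (0x2 on 5.17-rc4, 0x82 on 6.1.12): the compare constants in the disassembly, 0x402 and 0x406, match the XArray retry and zero entries, so the value loaded just before the trap looks like an XArray entry, and both RAX values have the low-bit pattern of an XArray internal entry. The sketch below mirrors the entry-encoding helpers from include/linux/xarray.h in Python. It assumes XA_CHUNK_SIZE is 64 (the default when CONFIG_BASE_SMALL is not set) and that the encoding is the same in both kernels; `describe` is a made-up helper. This is an interpretation of the register dump, not a confirmed diagnosis.

```python
# Assumption: XA_CHUNK_SHIFT == 6, i.e. 64 slots per XArray node.
XA_CHUNK_SIZE = 64


def xa_mk_internal(v):
    """Encode an internal entry the way xarray.h does: (v << 2) | 2."""
    return (v << 2) | 2


def xa_is_internal(entry):
    """Internal entries have the two low bits set to binary 10."""
    return (entry & 3) == 2


def xa_to_internal(entry):
    """Recover the integer stored in an internal entry."""
    return entry >> 2


def xa_is_sibling(entry):
    """Sibling entries are internal entries below xa_mk_internal(XA_CHUNK_SIZE - 1)."""
    return xa_is_internal(entry) and entry < xa_mk_internal(XA_CHUNK_SIZE - 1)


XA_RETRY_ENTRY = xa_mk_internal(256)  # matches "cmp $0x402,%rax" in the disassembly
XA_ZERO_ENTRY = xa_mk_internal(257)   # matches "cmp $0x406,%rax" in the disassembly


def describe(entry):
    """Hypothetical helper: classify a raw slot value seen in RAX."""
    if entry == XA_RETRY_ENTRY:
        return "retry entry"
    if entry == XA_ZERO_ENTRY:
        return "zero entry"
    if xa_is_sibling(entry):
        return "sibling entry for slot %d" % xa_to_internal(entry)
    if xa_is_internal(entry):
        return "other internal entry"
    return "pointer or value entry"


if __name__ == "__main__":
    for rax in (0x2, 0x82):
        print(hex(rax), "->", describe(rax))
```

Under these assumptions both RAX values decode as sibling entries (slots 0 and 32), which would fit the `***BAD SIBLING***` line printed by the debugging patch in the 5.17-rc4 report: a sibling entry was returned from the lookup and then dereferenced as if it were a folio.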
