Re: [Bug 216646] having TRANSPARENT_HUGEPAGE enabled hangs some applications (supervisor read access in kernel mode)

On Fri, 2 Dec 2022 16:58:42 +0000
Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:

> On Fri, Dec 02, 2022 at 12:56:40PM +0100, Thorsten Leemhuis wrote:
> > On 21.11.22 21:54, Andrew Morton wrote:
> > > On Mon, 21 Nov 2022 18:34:34 +0000 bugzilla-daemon@xxxxxxxxxx wrote:
> > > 
> > >> https://bugzilla.kernel.org/show_bug.cgi?id=216646
> > >>
> > >> --- Comment #6 from Mikhail Pletnev (mmp.dux@xxxxxxxxx) ---
> > >> i can't figure out how to use git send-email, so i will post this here instead
> > >>
> > >>>>> git bisect good 1854bc6e2420472676c5c90d3d6b15f6cd640e40
> > >>
> > >>> I suspect this is where your bisection went astray.  This should have
> > >>> been bad and it led you to the wrong commit.
> > >>
> > >> so i've applied your suggestion and did some more bisecting and arrived at
> > >> this:
> > 
> > That was 793917d997df ("mm/readahead: Add large folio readahead") which
> > is also from Matthew.
> > 
> > > Folks, thanks for continuing to work on this.
> > 
> > Well, since then nothing happened. :-/ Or have I missed something?
> > 
> > Matthew, do you still have this on your radar?
> 
> No, things tend to fall off my radar after a week or two of inactivity.
> Particularly since I went on holiday.  Since I wasn't cc'd on the bug,
> the activity was completely invisible to me.
> 
> Landing on 793917d997df makes a lot more sense.  That's where we
> actually start using large folios.  It doesn't really help narrow
> down the problem.  I have an idea for what it might be; patch to
> try will follow.  But I'll need feedback by email.


Sorry for the long absence. Here's what I ended up with (the commit you suspected):
(This also seems to fix the amdgpu crashes I've been having with 5.18 and newer kernels.)

> > > git bisect good 1854bc6e2420472676c5c90d3d6b15f6cd640e40

> I suspect this is where your bisection went astray.  This should have
> been bad and it led you to the wrong commit.

So I applied your suggestion, did some more bisecting, and arrived at this:


git bisect start
git bisect bad be1a63daffdd152ba4c7b71ab9fec2e39259b42b
git bisect good 8bb7eca972ad531c9b149c0a51ab43a417385813
git bisect good fee62ea772040a6b7d5d07d285dcf68f989fc81c
git bisect bad dbe946287e0825f0e9cd4cbeacfcde9d9b2dd168
git bisect bad 25fd2d41b505d0640bdfe67aa77c549de2d3c18a
git bisect bad 182966e1cd74ec0e326cd376de241803ee79741b
git bisect good b080cee72ef355669cbc52ff55dc513d37433600
git bisect good 3fe2f7446f1e029b220f7f650df6d138f91651f2
git bisect bad d51b1b33c51d147b757f042b4d336603b699f362
git bisect good 3bf03b9a0839c9fb06927ae53ebd0f960b19d408
git bisect bad 6b1f86f8e9c7f9de7ca1cb987b2cf25e99b1ae3a
git bisect good 4aed23a2f8aaaafad0232d3392afcf493c3c3df3
git bisect good ebf55c886eb7fc3c54d02ba1046f0ee38b81fc10
git bisect good d68eccad370665830e16e5c77611fde78cd749b3
git bisect good 3a3bae50af5d73fab5da20484029de77ca67bb2e
git bisect bad 1854bc6e2420472676c5c90d3d6b15f6cd640e40
git bisect good 421f1ab48452af48b64e205de1caca3d1ba415f4
git bisect bad 793917d997df2e432f3e9ac126e4482d68256d01
git bisect good 18788cfa236967741b83db1035ab24539e2a21bb
# first bad commit: [793917d997df2e432f3e9ac126e4482d68256d01] mm/readahead: Add large folio readahead
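
As a sanity check on a log like the one above, here is a minimal Python sketch that parses the verdict lines and confirms that the commit reported as "first bad" is also the most recent one marked bad. The `summarize` helper and the regexes are made up for illustration and are not part of git; only the tail of the log is embedded, copied verbatim.

```python
import re

# Tail of the bisect log above, copied verbatim.
BISECT_LOG = """\
git bisect bad 1854bc6e2420472676c5c90d3d6b15f6cd640e40
git bisect good 421f1ab48452af48b64e205de1caca3d1ba415f4
git bisect bad 793917d997df2e432f3e9ac126e4482d68256d01
git bisect good 18788cfa236967741b83db1035ab24539e2a21bb
# first bad commit: [793917d997df2e432f3e9ac126e4482d68256d01] mm/readahead: Add large folio readahead
"""

VERDICT = re.compile(r"^git bisect (good|bad) ([0-9a-f]{40})$")
FIRST_BAD = re.compile(r"^# first bad commit: \[([0-9a-f]{40})\]")


def summarize(log):
    """Return (verdicts, first_bad, last_bad) parsed from a bisect log."""
    verdicts = []
    first_bad = None
    for line in log.splitlines():
        m = VERDICT.match(line)
        if m:
            verdicts.append((m.group(1), m.group(2)))
            continue
        m = FIRST_BAD.match(line)
        if m:
            first_bad = m.group(1)
    # The most recent commit marked "bad" is the final upper bound of the search.
    last_bad = next((sha for v, sha in reversed(verdicts) if v == "bad"), None)
    return verdicts, first_bad, last_bad


verdicts, first_bad, last_bad = summarize(BISECT_LOG)
print(len(verdicts), "verdicts; first bad matches last bad:", first_bad == last_bad)
```

If the two hashes disagree, the log was probably edited by hand or truncated and the bisection is worth repeating.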

I verified the last two commits over a couple of days to be sure.

Here's the output of scripts/decode_stacktrace.sh:

RIP: 0010:__filemap_get_folio (/home/reinhardt/dev-apps/kernel/linux/./arch/x86/include/asm/atomic.h:29 /home/reinhardt/dev-apps/kernel/linux/./include/linux/atomic/atomic-arch-fallback.h:1158 /home/reinhardt/dev-apps/kernel/linux/./include/linux/atomic/atomic-arch-fallback.h:1183 /home/reinhardt/dev-apps/kernel/linux/./include/linux/atomic/atomic-instrumented.h:608 /home/reinhardt/dev-apps/kernel/linux/./include/linux/page_ref.h:238 /home/reinhardt/dev-apps/kernel/linux/./include/linux/page_ref.h:247 /home/reinhardt/dev-apps/kernel/linux/./include/linux/page_ref.h:280 /home/reinhardt/dev-apps/kernel/linux/./include/linux/page_ref.h:313 /home/reinhardt/dev-apps/kernel/linux/mm/filemap.c:1897 /home/reinhardt/dev-apps/kernel/linux/mm/filemap.c:1949) 
Code: 10 e8 a6 05 68 00 48 89 c3 48 3d 02 04 00 00 74 e2 48 3d 06 04 00 00 74 da 48 85 c0 0f 84 3e 02 00 00 a8 01 0f 85 40 02 00 00 <8b> 40 34 85 c0 74 c2 8d 50 01 f0 0f b1 53 34 75 f2 48 8b 54 24 28
All code
========
   0:	10 e8                	adc    %ch,%al
   2:	a6                   	cmpsb  %es:(%rdi),%ds:(%rsi)
   3:	05 68 00 48 89       	add    $0x89480068,%eax
   8:	c3                   	ret    
   9:	48 3d 02 04 00 00    	cmp    $0x402,%rax
   f:	74 e2                	je     0xfffffffffffffff3
  11:	48 3d 06 04 00 00    	cmp    $0x406,%rax
  17:	74 da                	je     0xfffffffffffffff3
  19:	48 85 c0             	test   %rax,%rax
  1c:	0f 84 3e 02 00 00    	je     0x260
  22:	a8 01                	test   $0x1,%al
  24:	0f 85 40 02 00 00    	jne    0x26a
  2a:*	8b 40 34             	mov    0x34(%rax),%eax		<-- trapping instruction
  2d:	85 c0                	test   %eax,%eax
  2f:	74 c2                	je     0xfffffffffffffff3
  31:	8d 50 01             	lea    0x1(%rax),%edx
  34:	f0 0f b1 53 34       	lock cmpxchg %edx,0x34(%rbx)
  39:	75 f2                	jne    0x2d
  3b:	48 8b 54 24 28       	mov    0x28(%rsp),%rdx

Code starting with the faulting instruction
===========================================
   0:	8b 40 34             	mov    0x34(%rax),%eax
   3:	85 c0                	test   %eax,%eax
   5:	74 c2                	je     0xffffffffffffffc9
   7:	8d 50 01             	lea    0x1(%rax),%edx
   a:	f0 0f b1 53 34       	lock cmpxchg %edx,0x34(%rbx)
   f:	75 f2                	jne    0x3
  11:	48 8b 54 24 28       	mov    0x28(%rsp),%rdx
RSP: 0000:ffffa39e84c2bcb0 EFLAGS: 00010246
RAX: 00000000000000c2 RBX: 00000000000000c2 RCX: 0000000000000002
RDX: 0000000000000034 RSI: ffffa39e84c2bcc0 RDI: ffff8c676acf8920
RBP: 0000000000000000 R08: ffffa39e84c2bd40 R09: 0000000000000000
R10: ffffffffffffffc0 R11: 0000000000000000 R12: 0000000000000000
R13: ffff8c675f8ceb78 R14: 00000000005a15b7 R15: fff000003fffffff
FS:  00007f8628ffc640(0000) GS:ffff8c6dbea80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000000000f6 CR3: 0000000166dda000 CR4: 0000000000750ee0
PKRU: 55555554
Call Trace:
<TASK>
filemap_fault (/home/reinhardt/dev-apps/kernel/linux/./include/linux/pagemap.h:531 /home/reinhardt/dev-apps/kernel/linux/mm/filemap.c:3105) 
__do_fault (/home/reinhardt/dev-apps/kernel/linux/mm/memory.c:3852) 
__handle_mm_fault (/home/reinhardt/dev-apps/kernel/linux/mm/memory.c:4169 /home/reinhardt/dev-apps/kernel/linux/mm/memory.c:4297 /home/reinhardt/dev-apps/kernel/linux/mm/memory.c:4555 /home/reinhardt/dev-apps/kernel/linux/mm/memory.c:4690) 
handle_mm_fault (/home/reinhardt/dev-apps/kernel/linux/mm/memory.c:4788) 
do_user_addr_fault (/home/reinhardt/dev-apps/kernel/linux/./include/linux/sched/signal.h:404 /home/reinhardt/dev-apps/kernel/linux/arch/x86/mm/fault.c:1399) 
exc_page_fault (/home/reinhardt/dev-apps/kernel/linux/./arch/x86/include/asm/irqflags.h:40 /home/reinhardt/dev-apps/kernel/linux/./arch/x86/include/asm/irqflags.h:75 /home/reinhardt/dev-apps/kernel/linux/arch/x86/mm/fault.c:1492 /home/reinhardt/dev-apps/kernel/linux/arch/x86/mm/fault.c:1540) 
? asm_exc_page_fault (/home/reinhardt/dev-apps/kernel/linux/./arch/x86/include/asm/idtentry.h:568) 
asm_exc_page_fault (/home/reinhardt/dev-apps/kernel/linux/./arch/x86/include/asm/idtentry.h:568) 
RIP: 0033:0x7f863519b789
Code: 66 66 2e 0f 1f 84 00 00 00 00 00 66 66 2e 0f 1f 84 00 00 00 00 00 66 66 2e 0f 1f 84 00 00 00 00 00 48 89 f8 48 83 fa 20 72 27 <c5> fe 6f 06 48 83 fa 40 0f 87 a9 00 00 00 c5 fe 6f 4c 16 e0 c5 fe
All code
========
   0:	66 66 2e 0f 1f 84 00 	data16 cs nopw 0x0(%rax,%rax,1)
   7:	00 00 00 00 
   b:	66 66 2e 0f 1f 84 00 	data16 cs nopw 0x0(%rax,%rax,1)
  12:	00 00 00 00 
  16:	66 66 2e 0f 1f 84 00 	data16 cs nopw 0x0(%rax,%rax,1)
  1d:	00 00 00 00 
  21:	48 89 f8             	mov    %rdi,%rax
  24:	48 83 fa 20          	cmp    $0x20,%rdx
  28:	72 27                	jb     0x51
  2a:*	c5 fe 6f 06          	vmovdqu (%rsi),%ymm0		<-- trapping instruction
  2e:	48 83 fa 40          	cmp    $0x40,%rdx
  32:	0f 87 a9 00 00 00    	ja     0xe1
  38:	c5 fe 6f 4c 16 e0    	vmovdqu -0x20(%rsi,%rdx,1),%ymm1
  3e:	c5                   	.byte 0xc5
  3f:	fe                   	.byte 0xfe

Code starting with the faulting instruction
===========================================
   0:	c5 fe 6f 06          	vmovdqu (%rsi),%ymm0
   4:	48 83 fa 40          	cmp    $0x40,%rdx
   8:	0f 87 a9 00 00 00    	ja     0xb7
   e:	c5 fe 6f 4c 16 e0    	vmovdqu -0x20(%rsi,%rdx,1),%ymm1
  14:	c5                   	.byte 0xc5
  15:	fe                   	.byte 0xfe
RSP: 002b:00007f8628ffb888 EFLAGS: 00010202
RAX: 00007f85f00405f0 RBX: 0000000000000000 RCX: 00007f8628ffba10
RDX: 0000000000004000 RSI: 00007f6b803572d5 RDI: 00007f85f00405f0
RBP: 00007f8628ffb8a8 R08: 000000006375fb7a R09: 0000000000000000
R10: 0000000000000008 R11: 0000000000000246 R12: 00007f85f000bc00
R13: 00007f8624002130 R14: 00000005a15b72d5 R15: 0000000000004000
</TASK>
Modules linked in: overlay xt_addrtype amdgpu iwlmvm mac80211 libarc4 drm_ttm_helper ttm gpu_sched drm_kms_helper backlight syscopyarea sysfillrect iwlwifi sysimgblt fb_sys_fops i2c_piix4 cfg80211 k10temp fuse configfs efivarfs
CR2: 00000000000000f6
---[ end trace 0000000000000000 ]---
RIP: 0010:__filemap_get_folio (/home/reinhardt/dev-apps/kernel/linux/./arch/x86/include/asm/atomic.h:29 /home/reinhardt/dev-apps/kernel/linux/./include/linux/atomic/atomic-arch-fallback.h:1158 /home/reinhardt/dev-apps/kernel/linux/./include/linux/atomic/atomic-arch-fallback.h:1183 /home/reinhardt/dev-apps/kernel/linux/./include/linux/atomic/atomic-instrumented.h:608 /home/reinhardt/dev-apps/kernel/linux/./include/linux/page_ref.h:238 /home/reinhardt/dev-apps/kernel/linux/./include/linux/page_ref.h:247 /home/reinhardt/dev-apps/kernel/linux/./include/linux/page_ref.h:280 /home/reinhardt/dev-apps/kernel/linux/./include/linux/page_ref.h:313 /home/reinhardt/dev-apps/kernel/linux/mm/filemap.c:1897 /home/reinhardt/dev-apps/kernel/linux/mm/filemap.c:1949) 
Code: 10 e8 a6 05 68 00 48 89 c3 48 3d 02 04 00 00 74 e2 48 3d 06 04 00 00 74 da 48 85 c0 0f 84 3e 02 00 00 a8 01 0f 85 40 02 00 00 <8b> 40 34 85 c0 74 c2 8d 50 01 f0 0f b1 53 34 75 f2 48 8b 54 24 28
All code
========
   0:	10 e8                	adc    %ch,%al
   2:	a6                   	cmpsb  %es:(%rdi),%ds:(%rsi)
   3:	05 68 00 48 89       	add    $0x89480068,%eax
   8:	c3                   	ret    
   9:	48 3d 02 04 00 00    	cmp    $0x402,%rax
   f:	74 e2                	je     0xfffffffffffffff3
  11:	48 3d 06 04 00 00    	cmp    $0x406,%rax
  17:	74 da                	je     0xfffffffffffffff3
  19:	48 85 c0             	test   %rax,%rax
  1c:	0f 84 3e 02 00 00    	je     0x260
  22:	a8 01                	test   $0x1,%al
  24:	0f 85 40 02 00 00    	jne    0x26a
  2a:*	8b 40 34             	mov    0x34(%rax),%eax		<-- trapping instruction
  2d:	85 c0                	test   %eax,%eax
  2f:	74 c2                	je     0xfffffffffffffff3
  31:	8d 50 01             	lea    0x1(%rax),%edx
  34:	f0 0f b1 53 34       	lock cmpxchg %edx,0x34(%rbx)
  39:	75 f2                	jne    0x2d
  3b:	48 8b 54 24 28       	mov    0x28(%rsp),%rdx

Code starting with the faulting instruction
===========================================
   0:	8b 40 34             	mov    0x34(%rax),%eax
   3:	85 c0                	test   %eax,%eax
   5:	74 c2                	je     0xffffffffffffffc9
   7:	8d 50 01             	lea    0x1(%rax),%edx
   a:	f0 0f b1 53 34       	lock cmpxchg %edx,0x34(%rbx)
   f:	75 f2                	jne    0x3
  11:	48 8b 54 24 28       	mov    0x28(%rsp),%rdx
RSP: 0000:ffffa39e84c2bcb0 EFLAGS: 00010246
RAX: 00000000000000c2 RBX: 00000000000000c2 RCX: 0000000000000002
RDX: 0000000000000034 RSI: ffffa39e84c2bcc0 RDI: ffff8c676acf8920
RBP: 0000000000000000 R08: ffffa39e84c2bd40 R09: 0000000000000000
R10: ffffffffffffffc0 R11: 0000000000000000 R12: 0000000000000000
R13: ffff8c675f8ceb78 R14: 00000000005a15b7 R15: fff000003fffffff
FS:  00007f8628ffc640(0000) GS:ffff8c6dbea80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000000000f6 CR3: 0000000166dda000 CR4: 0000000000750ee0
PKRU: 55555554
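
The register dump can be cross-checked with a little arithmetic. The sketch below (Python, illustration only) adds the 0x34 displacement of the trapping `mov 0x34(%rax),%eax` to RAX and compares the result with CR2. It also decodes RAX using the XArray internal-entry encoding `(v << 2) | 2` as I understand it from include/linux/xarray.h. The helper functions are Python re-implementations for illustration, not kernel code, and reading 0xc2 as an internal (possibly sibling) entry is an assumption that nothing in this thread confirms.

```python
# Values copied from the register dump above.
RAX = 0x00000000000000C2
CR2 = 0x00000000000000F6
DISP = 0x34  # displacement in "mov 0x34(%rax),%eax"


def xa_mk_internal(v):
    """Assumed XArray internal-entry encoding: (v << 2) | 2."""
    return (v << 2) | 2


def xa_is_internal(entry):
    """An internal entry has its low two bits equal to binary 10."""
    return (entry & 3) == 2


def xa_to_internal(entry):
    """Recover the value that xa_mk_internal() encoded."""
    return entry >> 2


fault_addr = RAX + DISP
print(hex(fault_addr), fault_addr == CR2)

# The two constants compared against %rax just before the trap.
print(hex(xa_mk_internal(256)), hex(xa_mk_internal(257)))

# RAX itself, read as an internal entry under the same assumption.
print(xa_is_internal(RAX), xa_to_internal(RAX))
```

Under these assumptions the faulting address is simply RAX plus the field offset, so the value loaded from the page cache was never a valid folio pointer. The 0x402 and 0x406 compares appear to correspond to internal entries 256 and 257, and 0xc2 would decode to internal entry 48.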

-- 
Mikhail Pletnev <mmp.dux@xxxxxxxxx>



