Re: WARNING in af_alg_make_sg

Peter Xu <peterx@xxxxxxxxxx> · Wed, 8 Apr 2020 11:18:15 -0400

On Wed, Apr 08, 2020 at 11:12:13AM -0400, Peter Xu wrote:
> On Wed, Apr 08, 2020 at 05:58:48PM +0800, Hillf Danton wrote:
> > 
> > On Wed, 08 Apr 2020 00:48:13 -0700
> > > syzbot found the following crash on:
> > > 
> > > HEAD commit:    763dede1 Merge tag 'for-linus-5.7-rc1' of git://git.kernel..
> > > git tree:       upstream
> > > console output: https://syzkaller.appspot.com/x/log.txt?x=12b919c7e00000
> > > kernel config:  https://syzkaller.appspot.com/x/.config?x=12205d036cec317f
> > > dashboard link: https://syzkaller.appspot.com/bug?extid=3be1a33f04dc782e9fd5
> > > compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> > > syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=142f3b8fe00000
> > > C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=159bd23fe00000
> > > 
> > > The bug was bisected to:
> > > 
> > > commit 4426e945df588f2878affddf88a51259200f7e29
> > > Author: Peter Xu <peterx@xxxxxxxxxx>
> > > Date:   Thu Apr 2 04:08:49 2020 +0000
> > > 
> > >     mm/gup: allow VM_FAULT_RETRY for multiple times
> > > 
> > > bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=1408ea9fe00000
> > > final crash:    https://syzkaller.appspot.com/x/report.txt?x=1608ea9fe00000
> > > console output: https://syzkaller.appspot.com/x/log.txt?x=1208ea9fe00000
> > > 
> > > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > > Reported-by: syzbot+3be1a33f04dc782e9fd5@xxxxxxxxxxxxxxxxxxxxxxxxx
> > > Fixes: 4426e945df58 ("mm/gup: allow VM_FAULT_RETRY for multiple times")
> > > 
> > > ------------[ cut here ]------------
> > > WARNING: CPU: 1 PID: 7094 at crypto/af_alg.c:404 af_alg_make_sg+0x399/0x400 crypto/af_alg.c:404
> > > Kernel panic - not syncing: panic_on_warn set ...
> > > CPU: 1 PID: 7094 Comm: syz-executor037 Not tainted 5.6.0-syzkaller #0
> > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> > > Call Trace:
> > >  __dump_stack lib/dump_stack.c:77 [inline]
> > >  dump_stack+0x188/0x20d lib/dump_stack.c:118
> > >  panic+0x2e3/0x75c kernel/panic.c:221
> > >  __warn.cold+0x2f/0x35 kernel/panic.c:582
> > >  report_bug+0x27b/0x2f0 lib/bug.c:195
> > >  fixup_bug arch/x86/kernel/traps.c:175 [inline]
> > >  fixup_bug arch/x86/kernel/traps.c:170 [inline]
> > >  do_error_trap+0x12b/0x220 arch/x86/kernel/traps.c:267
> > >  do_invalid_op+0x32/0x40 arch/x86/kernel/traps.c:286
> > >  invalid_op+0x23/0x30 arch/x86/entry/entry_64.S:1027
> > > RIP: 0010:af_alg_make_sg+0x399/0x400 crypto/af_alg.c:404
> > > Code: 5c 24 2b 31 ff 89 de e8 c5 b9 f8 fd 84 db 74 0e e8 8c b8 f8 fd 48 8b 04 24 48 89 44 24 70 e8 7e b8 f8 fd 0f 0b e8 77 b8 f8 fd <0f> 0b c7 44 24 4c ea ff ff ff e9 4b ff ff ff 48 89 df e8 40 6e 36
> > > RSP: 0018:ffffc900018779a0 EFLAGS: 00010293
> > > RAX: ffff8880a16b65c0 RBX: ffff8880a4141220 RCX: ffffffff837a763d
> > > RDX: 0000000000000000 RSI: ffffffff837a78f9 RDI: 0000000000000005
> > > RBP: 000000001fef2254 R08: ffff8880a16b65c0 R09: ffffed10142d6cb9
> > > R10: ffff8880a16b65c7 R11: ffffed10142d6cb8 R12: 0000000000000000
> > > R13: dffffc0000000000 R14: 0000000000000000 R15: dffffc0000000000
> > >  hash_sendmsg+0x45c/0xad0 crypto/algif_hash.c:94
> > >  sock_sendmsg_nosec net/socket.c:652 [inline]
> > >  sock_sendmsg+0xcf/0x120 net/socket.c:672
> > >  ____sys_sendmsg+0x6bf/0x7e0 net/socket.c:2362
> > >  ___sys_sendmsg+0x100/0x170 net/socket.c:2416
> > >  __sys_sendmsg+0xec/0x1b0 net/socket.c:2449
> > >  do_syscall_64+0xf6/0x7d0 arch/x86/entry/common.c:295
> > >  entry_SYSCALL_64_after_hwframe+0x49/0xb3
> > 
> > Make gup feed back correct error code in case of bailout.
> > 
> > --- a/mm/gup.c
> > +++ b/mm/gup.c
> > @@ -1325,8 +1325,11 @@ retry:
> >  		 * start trying again otherwise it can loop forever.
> >  		 */
> >  
> > -		if (fatal_signal_pending(current))
> > +		if (fatal_signal_pending(current)) {
> > +			if (!pages_done)
> > +				pages_done = -EINTR;
> >  			break;
> > +		}
> >  
> >  		*locked = 1;
> >  		down_read(&mm->mmap_sem);
> > 
> 
> CC Thomas too.
> 
> Sorry for all these mess...
> 
> Frankly speaking I didn't notice get_user_pages_fast() forbids
> returning zero while __get_user_pages() allowed it...  Ideally I think
> the gup callers should check against ret>0 to know exactly how many
> valid pages we've got, but it's not an excuse good enough...
> 
> Hillf, would you mind kick the syzbot directly next time when post the
> fix?  I'll make bold to do that for you this time, Thanks!

I got one extra "#"... Doing it again...

#syz test: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master

8<--------------------------------------------------------------------
>From 380003a56efc125565143c91ee6cefd7b3eba869 Mon Sep 17 00:00:00 2001
From: Hillf Danton <hdanton@xxxxxxxx>
Date: Wed, 8 Apr 2020 11:01:25 -0400
Subject: [PATCH] mm/gup: Let __get_user_pages_locked() return -EINTR for fatal
 signal

__get_user_pages_locked() will return 0 instead of -EINTR after commit
4426e945df588 which added extra code to allow gup detect fatal signal
faster.  Restore that behavior.

CC: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Signed-off-by: Hillf Danton <hdanton@xxxxxxxx>
[peterx: write commit message]
Signed-off-by: Peter Xu <peterx@xxxxxxxxxx>
---
 mm/gup.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/mm/gup.c b/mm/gup.c
index afce0bc47e70..6076df8e04a4 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -1326,8 +1326,11 @@ static __always_inline long __get_user_pages_locked(struct task_struct *tsk,
 		 * start trying again otherwise it can loop forever.
 		 */
 
-		if (fatal_signal_pending(current))
+		if (fatal_signal_pending(current)) {
+			if (!pages_done)
+				pages_done = -EINTR;
 			break;
+		}
 
 		ret = down_read_killable(&mm->mmap_sem);
 		if (ret) {
-- 
2.24.1

-- 
Peter Xu