Re: KCSAN: data-race in glue_cbc_decrypt_req_128bit / glue_cbc_decrypt_req_128bit

Marco Elver <elver@xxxxxxxxxx> · Tue, 14 Apr 2020 19:49:57 +0200

On Wed, 1 Apr 2020 at 18:20, Eric Biggers <ebiggers@xxxxxxxxxx> wrote:
>
> On Wed, Apr 01, 2020 at 12:24:01PM +0200, Marco Elver wrote:
> > On Wed, 1 Apr 2020 at 09:04, Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote:
> > >
> > > On Tue, Mar 31, 2020 at 10:27 PM Eric Biggers <ebiggers@xxxxxxxxxx> wrote:
> > > >
> > > > On Tue, Mar 31, 2020 at 12:35:13PM -0700, syzbot wrote:
> > > > > Hello,
> > > > >
> > > > > syzbot found the following crash on:
> > > > >
> > > > > HEAD commit:    b12d66a6 mm, kcsan: Instrument SLAB free with ASSERT_EXCLU..
> > > > > git tree:       https://github.com/google/ktsan.git kcsan
> > > > > console output: https://syzkaller.appspot.com/x/log.txt?x=111f0865e00000
> > > > > kernel config:  https://syzkaller.appspot.com/x/.config?x=10bc0131c4924ba9
> > > > > dashboard link: https://syzkaller.appspot.com/bug?extid=6a6bca8169ffda8ce77b
> > > > > compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> > > > >
> > > > > Unfortunately, I don't have any reproducer for this crash yet.
> > > > >
> > > > > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > > > > Reported-by: syzbot+6a6bca8169ffda8ce77b@xxxxxxxxxxxxxxxxxxxxxxxxx
> > > > >
> > > > > ==================================================================
> > > > > BUG: KCSAN: data-race in glue_cbc_decrypt_req_128bit / glue_cbc_decrypt_req_128bit
> > > > >
> > > > > write to 0xffff88809966e128 of 8 bytes by task 24119 on cpu 0:
> > > > >  u128_xor include/crypto/b128ops.h:67 [inline]
> > > > >  glue_cbc_decrypt_req_128bit+0x396/0x460 arch/x86/crypto/glue_helper.c:144
> > > > >  cbc_decrypt+0x26/0x40 arch/x86/crypto/serpent_avx2_glue.c:152
> > > > >  crypto_skcipher_decrypt+0x65/0x90 crypto/skcipher.c:652
> > > > >  _skcipher_recvmsg crypto/algif_skcipher.c:142 [inline]
> > > > >  skcipher_recvmsg+0x7fa/0x8c0 crypto/algif_skcipher.c:161
> > > > >  skcipher_recvmsg_nokey+0x5e/0x80 crypto/algif_skcipher.c:279
> > > > >  sock_recvmsg_nosec net/socket.c:886 [inline]
> > > > >  sock_recvmsg net/socket.c:904 [inline]
> > > > >  sock_recvmsg+0x92/0xb0 net/socket.c:900
> > > > >  ____sys_recvmsg+0x167/0x3a0 net/socket.c:2566
> > > > >  ___sys_recvmsg+0xb2/0x100 net/socket.c:2608
> > > > >  __sys_recvmsg+0x9d/0x160 net/socket.c:2642
> > > > >  __do_sys_recvmsg net/socket.c:2652 [inline]
> > > > >  __se_sys_recvmsg net/socket.c:2649 [inline]
> > > > >  __x64_sys_recvmsg+0x51/0x70 net/socket.c:2649
> > > > >  do_syscall_64+0xcc/0x3a0 arch/x86/entry/common.c:294
> > > > >  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> > > > >
> > > > > read to 0xffff88809966e128 of 8 bytes by task 24118 on cpu 1:
> > > > >  u128_xor include/crypto/b128ops.h:67 [inline]
> > > > >  glue_cbc_decrypt_req_128bit+0x37c/0x460 arch/x86/crypto/glue_helper.c:144
> > > > >  cbc_decrypt+0x26/0x40 arch/x86/crypto/serpent_avx2_glue.c:152
> > > > >  crypto_skcipher_decrypt+0x65/0x90 crypto/skcipher.c:652
> > > > >  _skcipher_recvmsg crypto/algif_skcipher.c:142 [inline]
> > > > >  skcipher_recvmsg+0x7fa/0x8c0 crypto/algif_skcipher.c:161
> > > > >  skcipher_recvmsg_nokey+0x5e/0x80 crypto/algif_skcipher.c:279
> > > > >  sock_recvmsg_nosec net/socket.c:886 [inline]
> > > > >  sock_recvmsg net/socket.c:904 [inline]
> > > > >  sock_recvmsg+0x92/0xb0 net/socket.c:900
> > > > >  ____sys_recvmsg+0x167/0x3a0 net/socket.c:2566
> > > > >  ___sys_recvmsg+0xb2/0x100 net/socket.c:2608
> > > > >  __sys_recvmsg+0x9d/0x160 net/socket.c:2642
> > > > >  __do_sys_recvmsg net/socket.c:2652 [inline]
> > > > >  __se_sys_recvmsg net/socket.c:2649 [inline]
> > > > >  __x64_sys_recvmsg+0x51/0x70 net/socket.c:2649
> > > > >  do_syscall_64+0xcc/0x3a0 arch/x86/entry/common.c:294
> > > > >  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> > > > >
> > > > > Reported by Kernel Concurrency Sanitizer on:
> > > > > CPU: 1 PID: 24118 Comm: syz-executor.1 Not tainted 5.6.0-rc1-syzkaller #0
> > > > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> > > > > ==================================================================
> > > > >
> > > >
> > > > I think this is a problem for almost all the crypto code.  Due to AF_ALG, both
> > > > the source and destination buffers can be userspace pages that were gotten with
> > > > get_user_pages().  Such pages can be concurrently modified, not just by the
> > > > kernel but also by userspace.
> > > >
> > > > I'm not sure what can be done about this.
> > >
> > > Oh, I thought it's something more serious like a shared crypto object.
> > > Thanks for debugging.
[...]
> > >
> > > Marco, I think we need to ignore all memory that comes from
> > > get_user_pages() somehow. Either not set watchpoints at all, or
> > > perhaps filter them out later if the check is not totally free.
> >
> > Makes sense. We already have similar checks, and they're in the
> > slow-path, so it shouldn't be a problem. Let me investigate.
>
> I'm wondering whether you really should move so soon to ignoring these races?
> They are still races; the crypto code is doing standard unannotated reads/writes
> of memory that can be concurrently modified.
>
[...]

Wanted to follow up on this, just to clarify: The issue here
essentially boils down to a user-space race involving an API that
isn't designed to be thread-safe with the provided arguments (pointer
to same user-space memory). The data race here merely manifests in
kernel code, but otherwise the kernel is unaffected (if it were
affected, a real fix would be needed). I.e. if we observe this data
race, KCSAN is helpfully pointing out that user space has a bug.
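
To make the user-space side concrete, here is a minimal sketch of the kind of
misuse I mean. It is my own illustration, not the syzbot program (there is no
reproducer). The names shared_dst, decrypt_into_shared and racy_decrypt_demo
are hypothetical, "cbc(aes)" with an all-zero key is just a convenient
stand-in for the serpent case in the report, and no IV is set because only the
destination of the output matters here. Two threads each submit a decryption
on their own AF_ALG operation socket, but both name the same buffer as the
destination:

```c
#include <linux/if_alg.h>
#include <pthread.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#ifndef SOL_ALG
#define SOL_ALG 279
#endif

/* The one user-space buffer that both requests name as their destination. */
static unsigned char shared_dst[32];

/* Queue two blocks of dummy ciphertext, then read the result into shared_dst. */
static void *decrypt_into_shared(void *arg)
{
	int op = *(int *)arg;
	unsigned char ct[32] = { 0 };
	union {
		char buf[CMSG_SPACE(sizeof(__u32))];
		struct cmsghdr align;
	} ctrl = { { 0 } };
	struct iovec iov = { .iov_base = ct, .iov_len = sizeof(ct) };
	struct msghdr msg = {
		.msg_iov = &iov, .msg_iovlen = 1,
		.msg_control = ctrl.buf, .msg_controllen = sizeof(ctrl.buf),
	};
	struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
	__u32 dec = ALG_OP_DECRYPT;

	cmsg->cmsg_level = SOL_ALG;
	cmsg->cmsg_type = ALG_SET_OP;
	cmsg->cmsg_len = CMSG_LEN(sizeof(dec));
	memcpy(CMSG_DATA(cmsg), &dec, sizeof(dec));

	if (sendmsg(op, &msg, 0) != (ssize_t)sizeof(ct))
		return (void *)-1;
	/* Per the discussion above, the kernel may write into these pages directly. */
	if (read(op, shared_dst, sizeof(shared_dst)) < 0)
		return (void *)-1;
	return NULL;
}

/* Returns 0 if both racy requests ran, -1 if AF_ALG setup or a request failed. */
int racy_decrypt_demo(void)
{
	struct sockaddr_alg sa = {
		.salg_family = AF_ALG,
		.salg_type = "skcipher",
		.salg_name = "cbc(aes)",	/* assumption: any CBC skcipher would do */
	};
	static const unsigned char key[16];	/* all-zero demo key */
	pthread_t t[2];
	int op[2] = { -1, -1 };
	int n = 0, ret = -1;
	int tfm = socket(AF_ALG, SOCK_SEQPACKET, 0);

	if (tfm < 0)
		return -1;
	if (bind(tfm, (struct sockaddr *)&sa, sizeof(sa)) == 0 &&
	    setsockopt(tfm, SOL_ALG, ALG_SET_KEY, key, sizeof(key)) == 0) {
		op[0] = accept(tfm, NULL, NULL);
		op[1] = accept(tfm, NULL, NULL);
	}
	if (op[0] >= 0 && op[1] >= 0) {
		ret = 0;
		while (n < 2 && pthread_create(&t[n], NULL, decrypt_into_shared, &op[n]) == 0)
			n++;
		if (n < 2)
			ret = -1;
		while (n-- > 0) {
			void *res;

			pthread_join(t[n], &res);
			if (res)
				ret = -1;
		}
	}
	if (op[0] >= 0)
		close(op[0]);
	if (op[1] >= 0)
		close(op[1]);
	close(tfm);
	return ret;
}
```

Nothing in the kernel is shared between the two requests except the
destination pages that user space itself chose to pass twice, so the contents
of shared_dst afterwards are whatever the interleaving happened to produce.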

There are some options to deal with cases like this:

1. Do nothing, and just let KCSAN report the data race.

2. Somehow make KCSAN distinguish in-kernel data races that are due to
user space misusing the API. KCSAN can still show the race, but
clearly denote the nature of it by e.g. saying "KCSAN: user data-race
in ..." (instead of "KCSAN: data-race in ..."). This will require one
of 2 things:

    a. Distinguish the access by memory range. This doesn't seem
great, because I don't know if we can apply a general rule like "all
races involving this memory are user-space's fault". What if we have
data races in the memory range that aren't user-space's fault?

    b. Mark the accesses somehow, e.g. by providing a code region in
which all races are deemed user-space's fault. This is likely more
problematic than (a), because saying something like "all races in this
section of code are user-space's fault" may also hide real issues.
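
For illustration only, the two variants of (2) could look roughly like the
sketch below. None of this exists in KCSAN. struct user_range, in_user_range,
the user_race_region_begin/end markers and race_report_prefix are hypothetical
names, and a real version would track the region depth per task rather than in
a global. The only thing taken from the proposal is the pair of report
prefixes.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: a memory range known to be backed by pinned user pages. */
struct user_range {
	uintptr_t start;
	size_t len;
};

/* (2.a) Does [addr, addr + size) lie entirely inside one of the n ranges? */
static bool in_user_range(const struct user_range *r, size_t n,
			  uintptr_t addr, size_t size)
{
	for (size_t i = 0; i < n; i++) {
		if (addr >= r[i].start && size <= r[i].len &&
		    addr - r[i].start <= r[i].len - size)
			return true;
	}
	return false;
}

/* (2.b) Hypothetical markers around code whose races are blamed on user space. */
static int user_region_depth;	/* assumption: would be per task in a real version */

static void user_race_region_begin(void) { user_region_depth++; }
static void user_race_region_end(void) { user_region_depth--; }
static bool in_user_race_region(void) { return user_region_depth > 0; }

/* Pick the report title once either mechanism has classified the race. */
static const char *race_report_prefix(bool user_fault)
{
	return user_fault ? "KCSAN: user data-race in" : "KCSAN: data-race in";
}
```

The sketch also shows where each variant can go wrong: in_user_range() blames
user space for every race in the range, and the region markers blame user
space for every race in the marked code, whether or not that is true.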

Because neither (2.a) nor (2.b) seems great, at present I would opt for (1).

Anything better we can do here?

Thanks,
-- Marco