Re: KASAN: use-after-free Read in sock_release

Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> · Wed, 29 Nov 2017 12:24:55 -0800

On Wed, Nov 29, 2017 at 11:37 AM, Cong Wang <xiyou.wangcong@xxxxxxxxx> wrote:
> (Cc'ing fs people...)
>
> On Wed, Nov 29, 2017 at 12:33 AM, syzbot wrote:
>> BUG: KASAN: use-after-free in sock_release+0x1c6/0x1e0 net/socket.c:601

Lovely.

Yeah, that is:

   601          if (rcu_dereference_protected(sock->wq, 1)->fasync_list)

and as you say, that "rcu_dereference_protected()" is confusing, but
that should be ok because we have a ref to the inode, and we're really
just testing that the pointer is zero.

The call trace here is:

>>  sock_release+0x1c6/0x1e0 net/socket.c:601
>>  sock_close+0x16/0x20 net/socket.c:1125
>>  __fput+0x333/0x7f0 fs/file_table.c:210
>>  ____fput+0x15/0x20 fs/file_table.c:244
>>  task_work_run+0x199/0x270 kernel/task_work.c:113

and there is no RCU protection anywhere, but it's really just a sanity
check, and the access _should_ be ok.

The stale access does seem to be because 'sock' (embedded in the
inode) itself that has been free'd:

>> Allocated by task 31066:
>>  kmalloc include/linux/slab.h:499 [inline]
>>  sock_alloc_inode+0xb4/0x300 net/socket.c:253
>>  alloc_inode+0x65/0x180 fs/inode.c:208
>>  new_inode_pseudo+0x69/0x190 fs/inode.c:890
>>  sock_alloc+0x41/0x270 net/socket.c:565
>>  __sock_create+0x148/0x850 net/socket.c:1225
>>  sock_create net/socket.c:1301 [inline]
>>  SYSC_socket net/socket.c:1331 [inline]
>>  SyS_socket+0xeb/0x200 net/socket.c:1311
>
> This looks more like a fs issue than network, my fs knowledge
> is not good enough to justify why the hell the inode could be
> destroyed before we release the fd.

Ugh. The inode freeing really is confusing and fairly involved, but
the last free *should* happen as part of the final dput() that is done
at the end of __fput().

So in __fput() calls into the

        if (file->f_op->release)
                file->f_op->release(inode, file);

then the inode should still be around, because the final ref won't be
done until later. And RCU simply shouldn't be an issue, because of
that reference count on the inode.

So it smells like some reference counting went wrong. The socket inode
creation is a bit confusing, and then in "sock_release()" we do have
that

        if (!sock->file) {
                iput(SOCK_INODE(sock));
                return;
        }
        sock->file = NULL;

which *also* tries to free the inode. I'm not sure what the logic (and
what the locking) behind that code all is.

What *is* the locking for "sock->file" anyway?

Al, can you take a look on the vfs side? But I'm inclined to blame the
socket code, because if we really had a "inode free'd early" issue at
a vfs level, I'd have expected us to see infinite chaos.

             Linus