Re: BUG() can be hit in tcp_collapse()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, 2016-11-10 at 07:44 -0800, Eric Dumazet wrote:
> On Thu, 2016-11-10 at 09:47 -0500, Vladis Dronov wrote:
> > Hello,
> > 
> > It was discovered by Marco Grassi <marco.gra@xxxxxxxxx> (many thanks) that the
> > latest stable Linux kernel v4.8.6 is crashing in tcp_collapse() after making
> > certain syscalls:
> > 
> > [    9.622886] kernel BUG at net/ipv4/tcp_input.c:4813!
> > [    9.623299] invalid opcode: 0000 [#1] SMP
> > [    9.623642] Modules linked in: iptable_nat nf_nat_ipv4 nf_nat
> > [    9.624287] CPU: 2 PID: 2871 Comm: poc Not tainted 4.8.6 #2
> > [    9.624730] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.2-20150714_191134- 04/01/2014
> > [    9.625459] task: ffff8801387b9a00 task.stack: ffff8801380e4000
> > [    9.625929] RIP: 0010:[<ffffffff8178d4ec>]  [<ffffffff8178d4ec>] tcp_collapse+0x3ac/0x3b0
> > [    9.626609] RSP: 0018:ffff8801380e7b78  EFLAGS: 00010282
> > [    9.627028] RAX: 00000000fffffff2 RBX: 0000000000000ec0 RCX: 0000000000000ec0
> > [    9.627587] RDX: ffff8801365cd000 RSI: 0000000000000000 RDI: ffff8801364106e0
> > [    9.628142] RBP: ffff8801380e7bc8 R08: 0000000000000000 R09: ffff88013b003300
> > [    9.628704] R10: ffff8801365cd000 R11: 0000000000000000 R12: 0000000000000ec0
> > [    9.629259] R13: ffff88013663ae00 R14: 00000000cdf0ca26 R15: ffff8801364106e0
> > [    9.629819] FS:  00007f2cef695800(0000) GS:ffff88013fc80000(0000) knlGS:0000000000000000
> > [    9.630945] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [    9.631655] CR2: 000000002002a000 CR3: 0000000139d46000 CR4: 00000000001406e0
> > [    9.632462] Stack:
> > [    9.632900]  0000000000000000 cdf0da2600000001 ffff880138050000 ffff8801380500a8
> > [    9.634138]  ffff880100000000 ffff880138050688 0000000000000900 ffff8801364136e0
> > [    9.635379]  ffff880138050000 ffff880138050688 ffff8801380e7c00 ffffffff8178d630
> > [    9.636622] Call Trace:
> > [    9.637087]  [<ffffffff8178d630>] tcp_try_rmem_schedule+0x140/0x380
> > [    9.637834]  [<ffffffff81791aa8>] tcp_data_queue+0x898/0xcf0
> > [    9.638538]  [<ffffffff8179210b>] tcp_rcv_established+0x20b/0x6c0
> > [    9.639268]  [<ffffffff81710143>] ? sk_reset_timer+0x13/0x30
> > [    9.639968]  [<ffffffff81813009>] tcp_v6_do_rcv+0x1b9/0x420
> > [    9.640666]  [<ffffffff81710b02>] __release_sock+0x82/0xf0
> > [    9.641353]  [<ffffffff81710b9b>] release_sock+0x2b/0x90
> > [    9.642029]  [<ffffffff817890ca>] tcp_sendmsg+0x55a/0xb60
> > [    9.642714]  [<ffffffff817b29d0>] inet_sendmsg+0x60/0x90
> > [    9.643389]  [<ffffffff8170c7b3>] sock_sendmsg+0x33/0x40
> > [    9.644064]  [<ffffffff8170ccee>] SYSC_sendto+0xee/0x160
> > [    9.645530]  [<ffffffff8170d6f9>] SyS_sendto+0x9/0x10
> > [    9.646190]  [<ffffffff81909df2>] entry_SYSCALL_64_fastpath+0x1a/0xa4
> > [    9.646947] Code: 48 c7 07 00 00 00 00 48 89 42 08 48 89 10 e8 cc 7e f8 ff 49 8b 47 30 48 8b 80 80 01 00 00 65 48 ff 80 b0 01 00 00 e9 72 fd ff ff <0f> 0b 66 90 55 48 89 e5 41 57 41 56 41 55 41 54 49 89 fe 53 8b 
> > [    9.651794] RIP  [<ffffffff8178d4ec>] tcp_collapse+0x3ac/0x3b0
> > [    9.652554]  RSP <ffff8801380e7b78>
> > 
> > The reproducer is generated by the syzkaller, please, see attached. The
> > following BUG() is hit:
> > 
> > [net/ipv4/tcp_input.c]
> > static void
> > tcp_collapse(struct sock *sk, struct sk_buff_head *list,
> >              struct sk_buff *head, struct sk_buff *tail,
> >              u32 start, u32 end)
> > {
> > ...
> > /* Copy data, releasing collapsed skbs. */
> > while (copy > 0) {
> >         int offset = start - TCP_SKB_CB(skb)->seq;
> >         int size = TCP_SKB_CB(skb)->end_seq - start;
> > 
> >         BUG_ON(offset < 0);
> >         if (size > 0) {
> >                 size = min(copy, size);
> > 4812:           if (skb_copy_bits(skb, offset, skb_put(nskb, size), size))
> > 4813:                   BUG();
> > 
> > /usr/src/linux-4.8.6/net/ipv4/tcp_input.c: 4812
> > 0xffffffff8178d390 <tcp_collapse+0x250>:        mov    %r12d,%esi
> > 0xffffffff8178d393 <tcp_collapse+0x253>:        callq  0xffffffff81713ce0 <skb_put>
> > 0xffffffff8178d398 <tcp_collapse+0x258>:        mov    -0x30(%rbp),%r8d
> > 0xffffffff8178d39c <tcp_collapse+0x25c>:        mov    %r12d,%ecx
> > 0xffffffff8178d39f <tcp_collapse+0x25f>:        mov    %rax,%rdx
> > 0xffffffff8178d3a2 <tcp_collapse+0x262>:        mov    %r15,%rdi
> > 0xffffffff8178d3a5 <tcp_collapse+0x265>:        mov    %r8d,%esi
> > 0xffffffff8178d3a8 <tcp_collapse+0x268>:        callq  0xffffffff81714b90 <skb_copy_bits>
> > 0xffffffff8178d3ad <tcp_collapse+0x26d>:        test   %eax,%eax
> > 0xffffffff8178d3af <tcp_collapse+0x26f>:        jne    0xffffffff8178d4ec <tcp_collapse+0x3ac>
> > ...
> > /usr/src/linux-4.8.6/net/ipv4/tcp_input.c: 4813
> > 0xffffffff8178d4ec <tcp_collapse+0x3ac>:        ud2    
> > 
> > I have checked that the reproducer can cause hitting this BUG() in the kernels
> > since, at least v4.0. I was not checking the earlier kernels except RHEL-7 ones
> > (3.10.0-xxx) which are not vulnerable.
> > 
> > The upstream kernels since v4.9-rc1 are not vulnerable too and I have bisected
> > the repo to the commit c9c3321257 which fixes the issue.
> > 
> > $ git tag --contain c9c3321257e1b95be9b375f811fb250162af8d39
> > v4.9-rc1
> > 
> > Stable v4.8.6 kernel with the c9c3321257 commit applied does not hit the BUG(),
> > so I believe this commit should be backported to the stable branch. This commit
> > applies cleanly to the v4.8.6 tree with just line offsets.
> > 
> > Meanwhile, I see that commit c9c3321257 just increases(?) an skb buffer(?)
> > (which fixes hitting the BUG() for this exact reproducer), but does not fix the
> > real reason (so another set of syscalls still may cause hitting the BUG()). This
> > is why I'm emailing not only to stable@, but also to netdev@, asking to review the
> > data above and probably develop a fix.
> 
> I do not believe c9c3321257 is a proper fix for this issue.
> 
> It only works around the bug for this specific use case, probably
> because of :
> 
> +       /* In case all data was pulled from skb frags (in __pskb_pull_tail()),
> +        * we can fix skb->truesize to its real value to avoid future drops.
> +        * This is valid because skb is not yet charged to the socket.
> +        * It has been noticed pure SACK packets were sometimes dropped
> +        * (if cooked by drivers without copybreak feature).
> +        */
> +       if (!skb->data_len)
> +               skb->truesize = SKB_TRUESIZE(skb_end_offset(skb));
> 
> Meaning that (tcp_win_from_space(skb->truesize) > skb->len) expression
> result differs in tcp_collapse() later.

The issue is that sk_filter() truncates an incoming packet to a smaller
value.

Bad things happen because TCP_SKB_CB(skb)->end_seq is not updated.

I guess other issues would also happen if the truncation also removes
part of tcp header.

sk_filter_trim_cap(sk, skb, tcp_hlen) would be needed,
or sk_filter_trim_cap(sk, skb, skb->len) to only ACCEPT/DROP packets,
but no truncations.






--
To unsubscribe from this list: send the line "unsubscribe stable" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]