(trimmed To:) On 09/23/16 11:21, Greg Kroah-Hartman wrote: > On Fri, Sep 23, 2016 at 10:43:27AM +0200, Holger Hoffstätte wrote: >> On 09/23/16 10:14, Greg Kroah-Hartman wrote: >>> On Thu, Sep 22, 2016 at 09:56:28PM +0200, Holger Hoffstätte wrote: >>>> did you forget to add the -net patches in stable-queue.git/tree/net-4.4 >>>> or are they on hold for some reason? I think they were supposed to go >>>> into .22. >>> >>> Really? I just put them there as I was pondering applying them, waiting >>> for the network maintainer to send me some more patches, or not. They >> >> Ahh..alright then. >> >>> Why do you think they should go into this release? >> >> Never mind - they are not really urgent. I just had most of them merged >> locally already and was just wondering. > > If you have them merged, and tested, mind sending them to me with your > tested-by mark on them? That way I know they work and can focus on the > ones that will require backporting effort. Attached. All with Tested-By are working and have done so for some time without issue (mixed ipv4/6 network in various functions). ipv6-addrconf-fix-dev-refcont-leak-when-dad-failed: seems to be already applied from a previous series, not sure which. tcp-cwnd-does-not-increase-in-tcp-yeah: For this I can only offer a Reviewed-by: since I cannot test it, but I did check that the code does what the comment says, and it merges/compiles cleanly. Hope this helps. Holger
>From foo@baz Wed Sep 21 12:45:10 CEST 2016 From: David Forster <dforster@xxxxxxxxxxx> Date: Wed, 3 Aug 2016 15:13:01 +0100 Subject: ipv4: panic in leaf_walk_rcu due to stale node pointer From: David Forster <dforster@xxxxxxxxxxx> [ Upstream commit 94d9f1c5906b20053efe375b6d66610bca4b8b64 ] Panic occurs when issuing "cat /proc/net/route" whilst populating FIB with > 1M routes. Use of cached node pointer in fib_route_get_idx is unsafe. BUG: unable to handle kernel paging request at ffffc90001630024 IP: [<ffffffff814cf6a0>] leaf_walk_rcu+0x10/0xe0 PGD 11b08d067 PUD 11b08e067 PMD dac4b067 PTE 0 Oops: 0000 [#1] SMP Modules linked in: nfsd auth_rpcgss oid_registry nfs_acl nfs lockd grace fscac snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep virti acpi_cpufreq button parport_pc ppdev lp parport autofs4 ext4 crc16 mbcache jbd tio_ring virtio floppy uhci_hcd ehci_hcd usbcore usb_common libata scsi_mod CPU: 1 PID: 785 Comm: cat Not tainted 4.2.0-rc8+ #4 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007 task: ffff8800da1c0bc0 ti: ffff88011a05c000 task.ti: ffff88011a05c000 RIP: 0010:[<ffffffff814cf6a0>] [<ffffffff814cf6a0>] leaf_walk_rcu+0x10/0xe0 RSP: 0018:ffff88011a05fda0 EFLAGS: 00010202 RAX: ffff8800d8a40c00 RBX: ffff8800da4af940 RCX: ffff88011a05ff20 RDX: ffffc90001630020 RSI: 0000000001013531 RDI: ffff8800da4af950 RBP: 0000000000000000 R08: ffff8800da1f9a00 R09: 0000000000000000 R10: ffff8800db45b7e4 R11: 0000000000000246 R12: ffff8800da4af950 R13: ffff8800d97a74c0 R14: 0000000000000000 R15: ffff8800d97a7480 FS: 00007fd3970e0700(0000) GS:ffff88011fd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: ffffc90001630024 CR3: 000000011a7e4000 CR4: 00000000000006e0 Stack: ffffffff814d00d3 0000000000000000 ffff88011a05ff20 ffff8800da1f9a00 ffffffff811dd8b9 0000000000000800 0000000000020000 00007fd396f35000 ffffffff811f8714 0000000000003431 ffffffff8138dce0 0000000000000f80 Call Trace: [<ffffffff814d00d3>] ? fib_route_seq_start+0x93/0xc0 [<ffffffff811dd8b9>] ? seq_read+0x149/0x380 [<ffffffff811f8714>] ? fsnotify+0x3b4/0x500 [<ffffffff8138dce0>] ? process_echoes+0x70/0x70 [<ffffffff8121cfa7>] ? proc_reg_read+0x47/0x70 [<ffffffff811bb823>] ? __vfs_read+0x23/0xd0 [<ffffffff811bbd42>] ? rw_verify_area+0x52/0xf0 [<ffffffff811bbe61>] ? vfs_read+0x81/0x120 [<ffffffff811bcbc2>] ? SyS_read+0x42/0xa0 [<ffffffff81549ab2>] ? entry_SYSCALL_64_fastpath+0x16/0x75 Code: 48 85 c0 75 d8 f3 c3 31 c0 c3 f3 c3 66 66 66 66 66 66 2e 0f 1f 84 00 00 a 04 89 f0 33 02 44 89 c9 48 d3 e8 0f b6 4a 05 49 89 RIP [<ffffffff814cf6a0>] leaf_walk_rcu+0x10/0xe0 RSP <ffff88011a05fda0> CR2: ffffc90001630024 Signed-off-by: Dave Forster <dforster@xxxxxxxxxxx> Acked-by: Alexander Duyck <alexander.h.duyck@xxxxxxxxx> Signed-off-by: David S. Miller <davem@xxxxxxxxxxxxx> Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx> Tested-by: Holger Hoffstätte <holger@xxxxxxxxxxxxxxxxxxxxxx> --- net/ipv4/fib_trie.c | 8 ++------ 1 file changed, 2 insertions(+), 6 deletions(-) --- a/net/ipv4/fib_trie.c +++ b/net/ipv4/fib_trie.c @@ -2453,9 +2453,7 @@ struct fib_route_iter { static struct key_vector *fib_route_get_idx(struct fib_route_iter *iter, loff_t pos) { - struct fib_table *tb = iter->main_tb; struct key_vector *l, **tp = &iter->tnode; - struct trie *t; t_key key; /* use cache location of next-to-find key */ @@ -2463,8 +2461,6 @@ static struct key_vector *fib_route_get_ pos -= iter->pos; key = iter->key; } else { - t = (struct trie *)tb->tb_data; - iter->tnode = t->kv; iter->pos = 0; key = 0; } @@ -2505,12 +2501,12 @@ static void *fib_route_seq_start(struct return NULL; iter->main_tb = tb; + t = (struct trie *)tb->tb_data; + iter->tnode = t->kv; if (*pos != 0) return fib_route_get_idx(iter, *pos); - t = (struct trie *)tb->tb_data; - iter->tnode = t->kv; iter->pos = 0; iter->key = 0;
>From foo@baz Wed Sep 21 12:45:10 CEST 2016 From: Dave Jones <davej@xxxxxxxxxxxxxxxxx> Date: Fri, 2 Sep 2016 14:39:50 -0400 Subject: ipv6: release dst in ping_v6_sendmsg From: Dave Jones <davej@xxxxxxxxxxxxxxxxx> [ Upstream commit 03c2778a938aaba0893f6d6cdc29511d91a79848 ] Neither the failure or success paths of ping_v6_sendmsg release the dst it acquires. This leads to a flood of warnings from "net/core/dst.c:288 dst_release" on older kernels that don't have 8bf4ada2e21378816b28205427ee6b0e1ca4c5f1 backported. That patch optimistically hoped this had been fixed post 3.10, but it seems at least one case wasn't, where I've seen this triggered a lot from machines doing unprivileged icmp sockets. Cc: Martin Lau <kafai@xxxxxx> Signed-off-by: Dave Jones <davej@xxxxxxxxxxxxxxxxx> Acked-by: Martin KaFai Lau <kafai@xxxxxx> Signed-off-by: David S. Miller <davem@xxxxxxxxxxxxx> Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx> Tested-by: Holger Hoffstätte <holger@xxxxxxxxxxxxxxxxxxxxxx> --- net/ipv6/ping.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) --- a/net/ipv6/ping.c +++ b/net/ipv6/ping.c @@ -150,8 +150,10 @@ int ping_v6_sendmsg(struct sock *sk, str rt = (struct rt6_info *) dst; np = inet6_sk(sk); - if (!np) - return -EBADF; + if (!np) { + err = -EBADF; + goto dst_err_out; + } if (!fl6.flowi6_oif && ipv6_addr_is_multicast(&fl6.daddr)) fl6.flowi6_oif = np->mcast_oif; @@ -186,6 +188,9 @@ int ping_v6_sendmsg(struct sock *sk, str } release_sock(sk); +dst_err_out: + dst_release(dst); + if (err) return err;
>From foo@baz Wed Sep 21 12:45:10 CEST 2016 From: Artem Germanov <agermanov@xxxxxxxxxxxxxx> Date: Wed, 7 Sep 2016 10:49:36 -0700 Subject: tcp: cwnd does not increase in TCP YeAH From: Artem Germanov <agermanov@xxxxxxxxxxxxxx> [ Upstream commit db7196a0d0984b933ccf2cd6a60e26abf466e8a3 ] Commit 76174004a0f19785a328f40388e87e982bbf69b9 (tcp: do not slow start when cwnd equals ssthresh ) introduced regression in TCP YeAH. Using 100ms delay 1% loss virtual ethernet link kernel 4.2 shows bandwidth ~500KB/s for single TCP connection and kernel 4.3 and above (including 4.8-rc4) shows bandwidth ~100KB/s. That is caused by stalled cwnd when cwnd equals ssthresh. This patch fixes it by proper increasing cwnd in this case. Signed-off-by: Artem Germanov <agermanov@xxxxxxxxxxxxxx> Acked-by: Dmitry Adamushko <d.adamushko@xxxxxxxxxxxxxx> Signed-off-by: David S. Miller <davem@xxxxxxxxxxxxx> Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx> Reviewed-by: Holger Hoffstätte <holger@xxxxxxxxxxxxxxxxxxxxxx> --- net/ipv4/tcp_yeah.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/net/ipv4/tcp_yeah.c +++ b/net/ipv4/tcp_yeah.c @@ -75,7 +75,7 @@ static void tcp_yeah_cong_avoid(struct s if (!tcp_is_cwnd_limited(sk)) return; - if (tp->snd_cwnd <= tp->snd_ssthresh) + if (tcp_in_slow_start(tp)) tcp_slow_start(tp, acked); else if (!yeah->doing_reno_now) {
>From foo@baz Wed Sep 21 12:45:10 CEST 2016 From: Eric Dumazet <edumazet@xxxxxxxxxx> Date: Wed, 17 Aug 2016 05:56:26 -0700 Subject: tcp: fix use after free in tcp_xmit_retransmit_queue() From: Eric Dumazet <edumazet@xxxxxxxxxx> [ Upstream commit bb1fceca22492109be12640d49f5ea5a544c6bb4 ] When tcp_sendmsg() allocates a fresh and empty skb, it puts it at the tail of the write queue using tcp_add_write_queue_tail() Then it attempts to copy user data into this fresh skb. If the copy fails, we undo the work and remove the fresh skb. Unfortunately, this undo lacks the change done to tp->highest_sack and we can leave a dangling pointer (to a freed skb) Later, tcp_xmit_retransmit_queue() can dereference this pointer and access freed memory. For regular kernels where memory is not unmapped, this might cause SACK bugs because tcp_highest_sack_seq() is buggy, returning garbage instead of tp->snd_nxt, but with various debug features like CONFIG_DEBUG_PAGEALLOC, this can crash the kernel. This bug was found by Marco Grassi thanks to syzkaller. Fixes: 6859d49475d4 ("[TCP]: Abstract tp->highest_sack accessing & point to next skb") Reported-by: Marco Grassi <marco.gra@xxxxxxxxx> Signed-off-by: Eric Dumazet <edumazet@xxxxxxxxxx> Cc: Ilpo Järvinen <ilpo.jarvinen@xxxxxxxxxxx> Cc: Yuchung Cheng <ycheng@xxxxxxxxxx> Cc: Neal Cardwell <ncardwell@xxxxxxxxxx> Acked-by: Neal Cardwell <ncardwell@xxxxxxxxxx> Reviewed-by: Cong Wang <xiyou.wangcong@xxxxxxxxx> Signed-off-by: David S. Miller <davem@xxxxxxxxxxxxx> Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx> Tested-by: Holger Hoffstätte <holger@xxxxxxxxxxxxxxxxxxxxxx> --- include/net/tcp.h | 2 ++ 1 file changed, 2 insertions(+) --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -1510,6 +1510,8 @@ static inline void tcp_check_send_head(s { if (sk->sk_send_head == skb_unlinked) sk->sk_send_head = NULL; + if (tcp_sk(sk)->highest_sack == skb_unlinked) + tcp_sk(sk)->highest_sack = NULL; } static inline void tcp_init_send_head(struct sock *sk)
>From foo@baz Wed Sep 21 12:45:10 CEST 2016 From: Eric Dumazet <edumazet@xxxxxxxxxx> Date: Mon, 22 Aug 2016 11:31:10 -0700 Subject: tcp: properly scale window in tcp_v[46]_reqsk_send_ack() From: Eric Dumazet <edumazet@xxxxxxxxxx> [ Upstream commit 20a2b49fc538540819a0c552877086548cff8d8d ] When sending an ack in SYN_RECV state, we must scale the offered window if wscale option was negotiated and accepted. Tested: Following packetdrill test demonstrates the issue : 0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3 +0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0 +0 bind(3, ..., ...) = 0 +0 listen(3, 1) = 0 // Establish a connection. +0 < S 0:0(0) win 20000 <mss 1000,sackOK,wscale 7, nop, TS val 100 ecr 0> +0 > S. 0:0(0) ack 1 win 28960 <mss 1460,sackOK, TS val 100 ecr 100, nop, wscale 7> +0 < . 1:11(10) ack 1 win 156 <nop,nop,TS val 99 ecr 100> // check that window is properly scaled ! +0 > . 1:1(0) ack 1 win 226 <nop,nop,TS val 200 ecr 100> Signed-off-by: Eric Dumazet <edumazet@xxxxxxxxxx> Cc: Yuchung Cheng <ycheng@xxxxxxxxxx> Cc: Neal Cardwell <ncardwell@xxxxxxxxxx> Acked-by: Yuchung Cheng <ycheng@xxxxxxxxxx> Acked-by: Neal Cardwell <ncardwell@xxxxxxxxxx> Signed-off-by: David S. Miller <davem@xxxxxxxxxxxxx> Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx> Tested-by: Holger Hoffstätte <holger@xxxxxxxxxxxxxxxxxxxxxx> --- net/ipv4/tcp_ipv4.c | 8 +++++++- net/ipv6/tcp_ipv6.c | 8 +++++++- 2 files changed, 14 insertions(+), 2 deletions(-) --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -808,8 +808,14 @@ static void tcp_v4_reqsk_send_ack(const u32 seq = (sk->sk_state == TCP_LISTEN) ? tcp_rsk(req)->snt_isn + 1 : tcp_sk(sk)->snd_nxt; + /* RFC 7323 2.3 + * The window field (SEG.WND) of every outgoing segment, with the + * exception of <SYN> segments, MUST be right-shifted by + * Rcv.Wind.Shift bits: + */ tcp_v4_send_ack(sock_net(sk), skb, seq, - tcp_rsk(req)->rcv_nxt, req->rsk_rcv_wnd, + tcp_rsk(req)->rcv_nxt, + req->rsk_rcv_wnd >> inet_rsk(req)->rcv_wscale, tcp_time_stamp, req->ts_recent, 0, --- a/net/ipv6/tcp_ipv6.c +++ b/net/ipv6/tcp_ipv6.c @@ -932,9 +932,15 @@ static void tcp_v6_reqsk_send_ack(const /* sk->sk_state == TCP_LISTEN -> for regular TCP_SYN_RECV * sk->sk_state == TCP_SYN_RECV -> for Fast Open. */ + /* RFC 7323 2.3 + * The window field (SEG.WND) of every outgoing segment, with the + * exception of <SYN> segments, MUST be right-shifted by + * Rcv.Wind.Shift bits: + */ tcp_v6_send_ack(sk, skb, (sk->sk_state == TCP_LISTEN) ? tcp_rsk(req)->snt_isn + 1 : tcp_sk(sk)->snd_nxt, - tcp_rsk(req)->rcv_nxt, req->rsk_rcv_wnd, + tcp_rsk(req)->rcv_nxt, + req->rsk_rcv_wnd >> inet_rsk(req)->rcv_wscale, tcp_time_stamp, req->ts_recent, sk->sk_bound_dev_if, tcp_v6_md5_do_lookup(sk, &ipv6_hdr(skb)->daddr), 0, 0);