Re: [MPTCP] Re: [PATCH net-next v7 02/11] sock: Make sk_protocol a 16-bit value

David Miller <davem@xxxxxxxxxxxxx> wrote:
> From: Mat Martineau <mathew.j.martineau@xxxxxxxxxxxxxxx>
> Date: Thu,  9 Jan 2020 07:59:15 -0800
> 
> > Match the 16-bit width of skbuff->protocol. Fills an 8-bit hole so
> > sizeof(struct sock) does not change.
> > 
> > Also take care of BPF field access for sk_type/sk_protocol. Both of them
> > are now outside the bitfield, so we can use load instructions without
> > further shifting/masking.
> > 
> > v5 -> v6:
> >  - update eBPF accessors, too (Intel's kbuild test robot)
> > v2 -> v3:
> >  - keep 'sk_type' 2 bytes aligned (Eric)
> > v1 -> v2:
> >  - preserve sk_pacing_shift as bit field (Eric)
> > 
> > Cc: Alexei Starovoitov <ast@xxxxxxxxxx>
> > Cc: Daniel Borkmann <daniel@xxxxxxxxxxxxx>
> > Cc: bpf@xxxxxxxxxxxxxxx
> > Co-developed-by: Paolo Abeni <pabeni@xxxxxxxxxx>
> > Signed-off-by: Paolo Abeni <pabeni@xxxxxxxxxx>
> > Co-developed-by: Matthieu Baerts <matthieu.baerts@xxxxxxxxxxxx>
> > Signed-off-by: Matthieu Baerts <matthieu.baerts@xxxxxxxxxxxx>
> > Signed-off-by: Mat Martineau <mathew.j.martineau@xxxxxxxxxxxxxxx>
> 
> This is worrisome for me.
> 
> We have lots of places that now are going to be assigning  sk->sk_protocol
> into a u8 somewhere else.  A lot of them are ok because limits are enforced
> in various places, but for example:
> 
> net/ipv6/udp.c:	fl6.flowi6_proto = sk->sk_protocol;
> net/l2tp/l2tp_ip6.c:	fl6.flowi6_proto = sk->sk_protocol;
> 
> net/ipv6/inet6_connection_sock.c:	fl6->flowi6_proto = sk->sk_protocol;
> 
> net/ipv6/af_inet6.c:		fl6.flowi6_proto = sk->sk_protocol;
> net/ipv6/datagram.c:	fl6->flowi6_proto = sk->sk_protocol;
> 
> This is just one small example situation, where flowi6_proto is a u8.

There are parts of the stack (e.g. in setsockopt code paths) that test
sk->sk_protocol against IPPROTO_TCP and then call TCP-specific code under
the sane assumption that sk is a tcp_sock struct.

With an 8-bit sk_protocol, mptcp_sock structs (which is what the kernel
gets via the file descriptor number) would be treated as tcp sockets,
because "IPPROTO_MPTCP & 0xff" yields IPPROTO_TCP.
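To make the aliasing concrete, here is a minimal userspace sketch (not
kernel code). The numeric values are the ones the Linux UAPI headers use
for IPPROTO_TCP (6) and IPPROTO_MPTCP (262); the demo_* names and the two
one-field structs are hypothetical stand-ins for struct sock with an 8-bit
and a 16-bit protocol field.

```c
#include <assert.h>
#include <stdint.h>

/* Numeric values match the Linux UAPI definitions of IPPROTO_TCP and
 * IPPROTO_MPTCP; they are redefined under demo names so the sketch
 * also builds against headers that predate IPPROTO_MPTCP. */
enum { DEMO_IPPROTO_TCP = 6, DEMO_IPPROTO_MPTCP = 262 };

/* 262 == 0x106, so the low byte alone is 6. */
static_assert((DEMO_IPPROTO_MPTCP & 0xff) == DEMO_IPPROTO_TCP,
	      "low byte of IPPROTO_MPTCP aliases IPPROTO_TCP");

/* Hypothetical stand-ins for struct sock with the two field widths. */
struct demo_sock8  { uint8_t  protocol; };
struct demo_sock16 { uint16_t protocol; };

/* Mimics an "sk->sk_protocol == IPPROTO_TCP" test on an 8-bit field. */
static int demo_is_tcp_8bit(int proto)
{
	struct demo_sock8 sk = { .protocol = (uint8_t)proto };

	return sk.protocol == DEMO_IPPROTO_TCP;
}

/* The same test on a 16-bit field, which keeps the full value. */
static int demo_is_tcp_16bit(int proto)
{
	struct demo_sock16 sk = { .protocol = (uint16_t)proto };

	return sk.protocol == DEMO_IPPROTO_TCP;
}
```

Passing DEMO_IPPROTO_MPTCP to demo_is_tcp_8bit() reports a match, while
demo_is_tcp_16bit() does not; a plain TCP socket matches in both cases.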

Changing IPPROTO_MPTCP to a value <= 255 could lead to conflicts with
real inet protocols in the future, so we can't redefine it to an 8-bit
value.

If we keep sk_protocol as an 8-bit field, we will need to make sure that
all places testing sk_protocol == IPPROTO_TCP gain an additional sanity
check to tell tcp and mptcp sockets apart.  Moreover, any future change to
kernel code would need the same extra test, so this is a non-starter to me.

Alternatively, we could change the first member of the mptcp_sock struct
from inet_connection_sock to a full tcp_sock struct.  That's roughly a 1k
increase of the mptcp_sock struct, to ~3744 bytes, but then we would not
have to worry about mptcp sockets ending up in tcp code paths.
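A rough userspace sketch of why the first member matters, under heavily
simplified assumptions: all demo_* types and their fields are hypothetical
and far smaller than the real structs. It only compares layouts, checking
whether an mptcp-private field falls inside the region that tcp-only code
would interpret as tcp_sock state.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily reduced stand-ins for the kernel structs. */
struct demo_icsk     { int icsk_state; };
struct demo_tcp_sock { struct demo_icsk icsk; int snd_cwnd; };

/* Layout A: only the connection-sock part is shared with tcp. */
struct demo_mptcp_small { struct demo_icsk icsk; int token; };

/* Layout B: a full tcp_sock comes first, mptcp state follows it. */
struct demo_mptcp_big { struct demo_tcp_sock tcp; int token; };

/* In both layouts the shared part starts at offset 0, which is what
 * makes casting the socket pointer to the embedded type possible. */
static_assert(offsetof(struct demo_mptcp_small, icsk) == 0, "icsk first");
static_assert(offsetof(struct demo_mptcp_big, tcp) == 0, "tcp first");

/* Returns 1 if the mptcp-private field lies inside the bytes that
 * tcp-only code would treat as part of a tcp_sock. */
static int demo_small_overlaps_tcp(void)
{
	return offsetof(struct demo_mptcp_small, token) <
	       sizeof(struct demo_tcp_sock);
}

static int demo_big_overlaps_tcp(void)
{
	return offsetof(struct demo_mptcp_big, token) <
	       sizeof(struct demo_tcp_sock);
}
```

In layout A the private field overlaps the tcp_sock region, so tcp code
reached by mistake would read or write mptcp state; in layout B it would
only touch the embedded tcp_sock, at the cost of the larger struct.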

If you think such a size increase is ok, I could give that solution a shot
and see what other problems with an 8-bit sk_protocol might remain.

Mat reported that /sys/kernel/debug/tracing/trace lists mptcp sockets as
IPPROTO_TCP in the '8 bit sk_protocol' case, but if that's the only issue
this might have a smaller/acceptable "avoidance fix".


