On Wed, Sep 7, 2022 at 9:53 AM Leonard Crestez <cdleonard@xxxxxxxxx> wrote: > > On 9/7/22 02:11, Eric Dumazet wrote: > > On Mon, Sep 5, 2022 at 12:06 AM Leonard Crestez <cdleonard@xxxxxxxxx> wrote: > >> > >> This is mainly intended to protect against local privilege escalations > >> through a rarely used feature so it is deliberately not namespaced. > >> > >> Enforcement is only at the setsockopt level, this should be enough to > >> ensure that the tcp_authopt_needed static key never turns on. > >> > >> No effort is made to handle disabling when the feature is already in > >> use. > >> > >> Signed-off-by: Leonard Crestez <cdleonard@xxxxxxxxx> > >> --- > >> Documentation/networking/ip-sysctl.rst | 6 ++++ > >> include/net/tcp_authopt.h | 1 + > >> net/ipv4/sysctl_net_ipv4.c | 39 ++++++++++++++++++++++++++ > >> net/ipv4/tcp_authopt.c | 25 +++++++++++++++++ > >> 4 files changed, 71 insertions(+) > >> > >> diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst > >> index a759872a2883..41be0e69d767 100644 > >> --- a/Documentation/networking/ip-sysctl.rst > >> +++ b/Documentation/networking/ip-sysctl.rst > >> @@ -1038,10 +1038,16 @@ tcp_challenge_ack_limit - INTEGER > >> Note that this per netns rate limit can allow some side channel > >> attacks and probably should not be enabled. > >> TCP stack implements per TCP socket limits anyway. > >> Default: INT_MAX (unlimited) > >> > >> +tcp_authopt - BOOLEAN > >> + Enable the TCP Authentication Option (RFC5925), a replacement for TCP > >> + MD5 Signatures (RFC2835). > >> + > >> + Default: 0 > >> + > > ... > > >> +#ifdef CONFIG_TCP_AUTHOPT > >> +static int proc_tcp_authopt(struct ctl_table *ctl, > >> + int write, void *buffer, size_t *lenp, > >> + loff_t *ppos) > >> +{ > >> + int val = sysctl_tcp_authopt; > > > > val = READ_ONCE(sysctl_tcp_authopt); > > > >> + struct ctl_table tmp = { > >> + .data = &val, > >> + .mode = ctl->mode, > >> + .maxlen = sizeof(val), > >> + .extra1 = SYSCTL_ZERO, > >> + .extra2 = SYSCTL_ONE, > >> + }; > >> + int err; > >> + > >> + err = proc_dointvec_minmax(&tmp, write, buffer, lenp, ppos); > >> + if (err) > >> + return err; > >> + if (sysctl_tcp_authopt && !val) { > > > > READ_ONCE(sysctl_tcp_authopt) > > > > Note that this test would still be racy, because another cpu might > > change sysctl_tcp_authopt right after the read. > > What meaningful races are possible here? This is a variable that changes > from 0 to 1 at most once. Two cpus can issue writes of 0 and 1 values at the same time. Depending on scheduling writing the 0 can 'win' the race and overwrite the value back to 0. This is in clear violation of the claim you are making (that the sysctl can only go once from 0 to 1) > > In theory if two processes attempt to assign "non-zero" at the same time > then one will "win" and the other will get an error but races between > userspace writing different values are possible for any sysctl. The > solution seems to be "write sysctls from a single place". > > All the checks are in sockopts - in theory if the sysctl is written on > one CPU then a sockopt can still fail on another CPU until caches are > flushed. Is this what you're worried about? > > In theory doing READ_ONCE might incur a slight penalty on sockopt but > not noticeable. Not at all. There is _no_ penalty using READ_ONCE(). Unless it is done in a loop and this prevents some compiler optimization. Please use WRITE_ONCE() and READ_ONCE() for all sysctl values used in TCP stack (and elsewhere) See all the silly patches we had recently. > > > > >> + net_warn_ratelimited("Enabling TCP Authentication Option is permanent\n"); > >> + return -EINVAL; > >> + } > >> + sysctl_tcp_authopt = val; > > > > WRITE_ONCE(sysctl_tcp_authopt, val), or even better: > > > > if (val) > > cmpxchg(&sysctl_tcp_authopt, 0, val); > > > >> + return 0; > >> +} > >> +#endif > >> + > > This would be useful if we did any sort of initialization here but we > don't. Crypto is initialized somewhere completely different.