On Wed, Nov 23, 2022 at 05:44:06PM +0800, Firo Yang wrote:
> Recently, a customer reported that from their container whose
> net namespace is different to the host's init_net, they can't set
> the container's net.sctp.rto_max to any value smaller than
> init_net.sctp.rto_min.
>
> For instance,
> Host:
> sudo sysctl net.sctp.rto_min
> net.sctp.rto_min = 1000
>
> Container:
> echo 100 > /mnt/proc-net/sctp/rto_min
> echo 400 > /mnt/proc-net/sctp/rto_max
> echo: write error: Invalid argument
>
> This is caused by the check made from this 'commit 4f3fdf3bc59c
> ("sctp: add check rto_min and rto_max in sysctl")'
> When validating the input value, it's always referring the boundary
> value set for the init_net namespace.
>
> Having container's rto_max smaller than host's init_net.sctp.rto_min
> does make sense. Considering that the rto between two containers on the
> same host is very likely smaller than it for two hosts.

Makes sense. And also, here, it is not using the init_net as
boundaries for the values themselves. I mean, rto_min in init_net
won't be the minimum allowed for rto_min in other netns. Ditto for
rto_max.

More below.

>
> So to fix this problem, just referring the boundary value from the net
> namespace where the new input value came from should be enough.
>
> Signed-off-by: Firo Yang <firo.yang@xxxxxxxx>
> ---
>  net/sctp/sysctl.c | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/net/sctp/sysctl.c b/net/sctp/sysctl.c
> index b46a416787ec..e167df4dc60b 100644
> --- a/net/sctp/sysctl.c
> +++ b/net/sctp/sysctl.c
> @@ -429,6 +429,9 @@ static int proc_sctp_do_rto_min(struct ctl_table *ctl, int write,
>  	else
>  		tbl.data = &net->sctp.rto_min;
>  
> +	if (net != &init_net)
> +		max = net->sctp.rto_max;

This also affects other sysctls:

$ grep -e procname -e extra sysctl.c | grep -B1 extra.*init_net
                .extra1         = SYSCTL_ONE,
                .extra2         = &init_net.sctp.rto_max
                .procname       = "rto_max",
                .extra1         = &init_net.sctp.rto_min,
--
                .extra1         = SYSCTL_ZERO,
                .extra2         = &init_net.sctp.ps_retrans,
                .procname       = "ps_retrans",
                .extra1         = &init_net.sctp.pf_retrans,

And apparently, SCTP is the only one doing such dynamic limits. At
least in networking.

While the issue you reported is fixable this way, for ps/pf_retrans,
it is not, as it is using proc_dointvec_minmax() and it will simply
consume those values (with no netns translation).

So what about patching sctp_sysctl_net_register() instead, to update
these pointers during netns creation? Right after where it updates the
'data' one in there:

        for (i = 0; table[i].data; i++)
                table[i].data += (char *)(&net->sctp) - (char *)&init_net.sctp;

Thanks,
Marcelo

> +
>  	ret = proc_dointvec(&tbl, write, buffer, lenp, ppos);
>  	if (write && ret == 0) {
>  		if (new_value > max || new_value < min)
> @@ -457,6 +460,9 @@ static int proc_sctp_do_rto_max(struct ctl_table *ctl, int write,
>  	else
>  		tbl.data = &net->sctp.rto_max;
>  
> +	if (net != &init_net)
> +		min = net->sctp.rto_min;
> +
>  	ret = proc_dointvec(&tbl, write, buffer, lenp, ppos);
>  	if (write && ret == 0) {
>  		if (new_value > max || new_value < min)
> -- 
> 2.26.2
>
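
For illustration only, here is a rough userspace sketch of the pointer
rebasing suggested above for sctp_sysctl_net_register(). It is not
kernel code and has not been run against a kernel tree. The types
struct ctl_entry, struct net and struct netns_sctp are cut-down
stand-ins, and sysctl_one, rebase(), net_register(), write_value() and
demo() are hypothetical names made up for this sketch. The rto_max
default of 60000 is also an assumption. One detail it tries to show:
extra1/extra2 can only be shifted when they point inside init_net.sctp,
since entries such as SYSCTL_ONE point at shared constants and must be
left alone.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Cut-down stand-ins for the kernel structures (assumption). */
struct netns_sctp { int rto_min; int rto_max; };
struct net { struct netns_sctp sctp; };

struct ctl_entry {		/* simplified stand-in for struct ctl_table */
	const char *procname;
	int *data;
	int *extra1;		/* lower bound, NULL if none */
	int *extra2;		/* upper bound, NULL if none */
};

static struct net init_net = { .sctp = { .rto_min = 1000, .rto_max = 60000 } };
static int sysctl_one = 1;	/* plays the role of SYSCTL_ONE */

/* Template table: every pointer refers to init_net, as in sysctl.c. */
static const struct ctl_entry sctp_template[] = {
	{ "rto_min", &init_net.sctp.rto_min, &sysctl_one, &init_net.sctp.rto_max },
	{ "rto_max", &init_net.sctp.rto_max, &init_net.sctp.rto_min, NULL },
	{ NULL, NULL, NULL, NULL }
};

/* Shift p into net->sctp only if it points inside init_net.sctp. */
static int *rebase(int *p, struct net *net)
{
	uintptr_t base = (uintptr_t)&init_net.sctp;
	uintptr_t addr = (uintptr_t)p;

	if (!p || addr < base || addr >= base + sizeof(init_net.sctp))
		return p;	/* NULL or a shared constant: keep as is */
	return (int *)((char *)&net->sctp + (addr - base));
}

/* Per-netns setup; rebase_bounds == 0 models today's data-only rebasing. */
static void net_register(struct ctl_entry *table, struct net *net,
			 int rebase_bounds)
{
	assert(net != NULL);
	for (int i = 0; table[i].procname; i++) {
		table[i].data = rebase(table[i].data, net);
		if (rebase_bounds) {
			table[i].extra1 = rebase(table[i].extra1, net);
			table[i].extra2 = rebase(table[i].extra2, net);
		}
	}
}

/* Minimal min/max check; -1 stands in for -EINVAL. */
static int write_value(struct ctl_entry *e, int val)
{
	if ((e->extra1 && val < *e->extra1) || (e->extra2 && val > *e->extra2))
		return -1;
	*e->data = val;
	return 0;
}

/* Replays the reported case; returns 0 if both tables behave as described. */
int demo(void)
{
	struct net old_ns = { .sctp = { .rto_min = 1000, .rto_max = 60000 } };
	struct net new_ns = old_ns;
	struct ctl_entry old_tbl[3], new_tbl[3];

	memcpy(old_tbl, sctp_template, sizeof(old_tbl));
	memcpy(new_tbl, sctp_template, sizeof(new_tbl));
	net_register(old_tbl, &old_ns, 0);
	net_register(new_tbl, &new_ns, 1);

	/* Bounds still in init_net: rto_max = 400 is rejected (400 < 1000). */
	if (write_value(&old_tbl[0], 100) != 0) return 1;
	if (write_value(&old_tbl[1], 400) == 0) return 2;

	/* Bounds rebased: rto_max is checked against this netns' rto_min. */
	if (write_value(&new_tbl[0], 100) != 0) return 3;
	if (write_value(&new_tbl[1], 400) != 0) return 4;
	if (write_value(&new_tbl[1], 50) == 0) return 5;	/* below rto_min */

	/* init_net itself must be untouched. */
	if (init_net.sctp.rto_min != 1000 || init_net.sctp.rto_max != 60000)
		return 6;
	return 0;
}
```

Calling demo() from a small main() should return 0 if the sketch
matches the description above. The same range check would presumably
be needed in the real function so that ps_retrans/pf_retrans get
translated while SYSCTL_ZERO and SYSCTL_ONE stay shared.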