On June 29, 2021 4:21:59 PM GMT+08:00, Eric Dumazet <edumazet@xxxxxxxxxx> wrote:
>On Tue, Jun 29, 2021 at 9:17 AM Nguyen Dinh Phi <phind.uet@xxxxxxxxx> wrote:
>>
>> On June 29, 2021 1:20:19 AM GMT+08:00, Neal Cardwell <ncardwell@xxxxxxxxxx> wrote:
>> >)
>> >
>> >On Mon, Jun 28, 2021 at 1:15 PM Phi Nguyen <phind.uet@xxxxxxxxx> wrote:
>> >>
>> >> On 6/29/2021 12:24 AM, Neal Cardwell wrote:
>> >>
>> >> > Thanks.
>> >> >
>> >> > Can you also please provide a summary of the event sequence that
>> >> > triggers the bug? Based on your Reported-by tag, I guess this is based
>> >> > on the syzbot reproducer:
>> >> >
>> >> > https://groups.google.com/g/syzkaller-bugs/c/VbHoSsBz0hk/m/cOxOoTgPCAAJ
>> >> >
>> >> > but perhaps you can give a summary of the event sequence that causes
>> >> > the bug? Is it that the call:
>> >> >
>> >> > setsockopt$inet_tcp_TCP_CONGESTION(r0, 0x6, 0xd,
>> >> > &(0x7f0000000000)='cdg\x00', 0x4)
>> >> >
>> >> > initializes the CC and happens before the connection is established,
>> >> > and then when the connection is established, the line that sets:
>> >> > icsk->icsk_ca_initialized = 0;
>> >> > is incorrect, causing the CC to be initialized again without first
>> >> > calling the cleanup code that deallocates the CDG-allocated memory?
>> >> >
>> >> > thanks,
>> >> > neal
>> >> >
>> >>
>> >> Hi Neal,
>> >>
>> >> The gdb stack trace that lead to init_transfer_input() is as bellow, the
>> >> current sock state is TCP_SYN_RECV.
>> >
>> >Thanks. That makes sense as a snapshot of time for
>> >tcp_init_transfer(), but I think what would be more useful would be a
>> >description of the sequence of events, including when the CC was
>> >initialized previous to that point (as noted above, was it that the
>> >setsockopt(TCP_CONGESTION) completed before that point?).
>> >
>> >thanks,
>> >neal
>>
>> I resend my message because I accidently used html format in last one.
>> I am very sorry for the inconvenience caused.
>> ---
>> Yes, the CC had been initialized by the setsockopt, after that, it was
>> initialized again in function tcp_init_transfer() because of setting
>> isck_ca_initialized to 0.
>
>"the setsockopt" is rather vague, sorry.
>
>
>The hard part is that all scenarios have to be considered.
>
>TCP flows can either be passive and active.
>
>CC can be set :
>
>1) Before the connect() or accept()
>2) After the connect() or accept()
>3) after the connect() but before 3WHS is completed.
>
>So we need to make sure all cases will still work with any combination
>of CDG CC (before/after) in the picture.
>
>Note that a memory leak for a restricted CC (CDG can only be used by
>CAP_NET_ADMIN privileged user)
> is a small problem compared to more serious bug that could be added
>by an incomplete fix.
>
>I also note that if icsk_ca_priv] was increased from 104 to 120 bytes,
>tcp_cdg would no longer need a dynamic memory allocation.
>
>Thank you.

Hi,

I will try to work out the full event sequence. I am also afraid of
making a change that could affect a large part of the kernel.

About CDG: how can we get rid of the dynamic allocation by increasing
icsk_ca_priv to 120 bytes? The window size is a module parameter, so I
assume the size of the allocation is not fixed.

Since the problem only happens with CDG, would adding a check in its
tcp_cdg_init() function be OK? And regarding icsk_ca_initialized, can I
expect it to be 0 on entry to a CC's init function?

Thank you.
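
For reference, here is a minimal user-space sketch of what I understand
scenario 1 (CC set before connect()) to look like. It is only an
illustration based on the setsockopt call quoted from the syzbot
reproducer, not the full reproducer. The helper names set_cc() and
get_cc() are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: select congestion control `name` on socket `fd`.
 * Returns 0 on success, -1 with errno set on failure.
 * The length includes the trailing NUL, as in the syzbot call
 * (0x4 bytes for "cdg\0").
 */
static int set_cc(int fd, const char *name)
{
	assert(name != NULL);
	return setsockopt(fd, IPPROTO_TCP, TCP_CONGESTION,
			  name, strlen(name) + 1);
}

/* Hypothetical helper: read back the CC name currently attached to `fd`.
 * `buf` is zeroed first so the result is always NUL-terminated as long
 * as `len` is larger than the name.
 */
static int get_cc(int fd, char *buf, socklen_t len)
{
	memset(buf, 0, len);
	return getsockopt(fd, IPPROTO_TCP, TCP_CONGESTION, buf, &len);
}
```

Calling set_cc(fd, "cdg") on a fresh TCP socket and only then calling
connect() should correspond to case 1 in the list above. This assumes the
tcp_cdg module is available and the caller has CAP_NET_ADMIN, since CDG is
a restricted CC.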