On Thu, Dec 17, 2020 at 01:41:58AM +0900, Kuniyuki Iwashima wrote: [ ... ] > > There may also be places assuming that the req->rsk_listener will never > > change once it is assigned. not sure. have not looked closely yet. > > I have checked this again. There are no functions that expect explicitly > req->rsk_listener never change except for BUG_ON in inet_child_forget(). > No BUG_ON/WARN_ON does not mean they does not assume listener never > change, but such functions still work properly if rsk_listener is changed. The migration not only changes the ptr value of req->rsk_listener, it also means req is moved to another listener. (e.g. by updating the qlen of the old sk and new sk) Lets reuse the example about two cores at the TCP_NEW_SYN_RECV path racing to finish up the 3WHS. One core is already at inet_csk_complete_hashdance() doing "reqsk_queue_removed(&inet_csk(sk)->icsk_accept_queue, req))". What happen if another core migrates the req to another listener? Would the "reqsk_queue_removed(&inet_csk(sk)->icsk_accept_queue, req))" doing thing on the accept_queue that this req no longer belongs to? Also, from a quick look at reqsk_timer_handler() on how queue->young and req->num_timeout are updated, I am not sure the reqsk_queue_migrated() will work also: +static inline void reqsk_queue_migrated(struct request_sock_queue *old_accept_queue, + struct request_sock_queue *new_accept_queue, + const struct request_sock *req) +{ + atomic_dec(&old_accept_queue->qlen); + atomic_inc(&new_accept_queue->qlen); + + if (req->num_timeout == 0) { What if reqsk_timer_handler() is running in parallel and updating req->num_timeout? + atomic_dec(&old_accept_queue->young); + atomic_inc(&new_accept_queue->young); + } +} It feels like some of the "own_req" related logic may be useful here. not sure. could be something worth to think about. > > > > It probably needs some more thoughts here to get a simpler solution. > > Is it fine to move sock_hold() before assigning rsk_listener and defer > sock_put() to the end of tcp_v[46]_rcv() ? I don't see how this ordering helps, considering the migration can happen any time at another core. > > Also, we have to rewrite rsk_listener first and then call sock_put() in > reqsk_timer_handler() so that rsk_listener always has refcount more than 1. > > ---8<--- > struct sock *nsk, *osk; > bool migrated = false; > ... > sock_hold(req->rsk_listener); // (i) > sk = req->rsk_listener; > ... > if (sk->sk_state == TCP_CLOSE) { > osk = sk; > // do migration without sock_put() > sock_hold(nsk); // (ii) (as with (i)) > sk = nsk; > migrated = true; > } > ... > if (migrated) { > sock_put(sk); // pair with (ii) > sock_put(osk); // decrement old listener's refcount > sk = osk; > } > sock_put(sk); // pair with (i) > ---8<---