On Tue, 2022-03-01 at 02:26 +0800, Hao Xu wrote: > > On 2/25/22 13:32, Olivier Langlois wrote: > > On Mon, 2022-02-21 at 13:23 +0800, Hao Xu wrote: > > > > @@ -5776,6 +5887,7 @@ static int __io_arm_poll_handler(struct > > > > io_kiocb *req, > > > > __io_poll_execute(req, mask); > > > > return 0; > > > > } > > > > + io_add_napi(req->file, req->ctx); > > > I think this may not be the right place to do it. the process > > > will > > > be: > > > arm_poll sockfdA--> get invalid napi_id from sk->napi_id --> > > > event > > > triggered --> arm_poll for sockfdA again --> get valid napi_id > > > then why not do io_add_napi() in event > > > handler(apoll_task_func/poll_task_func). > > You have a valid concern that the first time a socket is passed to > > io_uring that napi_id might not be assigned yet. > > > > OTOH, getting it after data is available for reading does not help > > neither since busy polling must be done before data is received. > > > > for both places, the extracted napi_id will only be leveraged at > > the > > next polling. > > Hi Olivier, > > I think we have some gap here. AFAIK, it's not 'might not', it is > > 'definitely not', the sk->napi_id won't be valid until the poll > callback. > > Some driver's code FYR: > (drivers/net/ethernet/intel/e1000/e1000_main.c) > > e1000_receive_skb-->napi_gro_receive-->napi_skb_finish-- > >gro_normal_one > > and in gro_normal_one(), it does: > > if (napi->rx_count >= gro_normal_batch) > gro_normal_list(napi); > > > The gro_normal_list() delivers the info up to the specifical network > protocol like tcp. > > And then sk->napi_id is set, meanwhile the poll callback is > triggered. > > So that's why I call the napi polling technology a 'speculation'. > It's > totally for the > > future data. Correct me if I'm wrong especially for the poll callback > triggering part. > When I said 'might not', I was meaning that from the io_uring point of view, it has no idea what is the previous socket usage. If it has been used outside io_uring, the napi_id could available on the first call. If it is really read virgin socket, neither my choosen call site or your proposed sites will make the napi busy poll possible for the first poll. I feel like there is not much to gain to argue on this point since I pretty much admitted that your solution was most likely the only call site making MULTIPOLL requests work correctly with napi busy poll as those requests could visit __io_arm_poll_handler only once (Correct me if my statement is wrong). The only issue was that I wasn't sure is how using your calling sites would make locking work. I suppose that adding a dedicated spinlock for protecting napi_list instead of relying on uring_lock could be a solution. Would that work?