Re: napi_busy_poll

On 2022/2/19 at 3:02 PM, Olivier Langlois wrote:
On Fri, 2022-02-18 at 15:41 +0800, Hao Xu wrote:

Hi Olivier,

Have you tried issuing just one recv/pollin request and observing the
napi_id?

Hi Hao, not precisely, but you are 100% right about where the
association is done. It is when a packet is received that the
association is made. This happens in a few places, but the most likely
place where it can happen with my NIC (Intel igb) is inside
napi_gro_receive().

Yes: when a packet is received, skb->napi_id is set; once a batch of
them has been received, the skbs are delivered to the protocol layer
and sk->napi_id is set.
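
To make the two steps above concrete, here is a minimal userspace sketch.
It is only a toy model: the toy_* structs and functions are hypothetical
names that mimic the two fields discussed here, not the real kernel
definitions, and 0 is assumed to mean "no id assigned yet".

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for the packet and the socket (NOT the kernel structs). */
struct toy_skb  { unsigned int napi_id; };
struct toy_sock { unsigned int napi_id; };

/* Step 1: the receive path tags the packet with the id of the NAPI
 * context that received it. */
static void toy_skb_mark_napi_id(struct toy_skb *skb, unsigned int rx_napi_id)
{
    assert(skb != NULL);
    skb->napi_id = rx_napi_id;
}

/* Step 2: when the packet is delivered to the socket, the id is copied
 * from the packet to the socket. */
static void toy_sk_mark_napi_id(struct toy_sock *sk, const struct toy_skb *skb)
{
    assert(sk != NULL && skb != NULL);
    if (sk->napi_id != skb->napi_id)
        sk->napi_id = skb->napi_id;
}

/* Busy poll has a target only once the socket carries an id
 * (assumption of this sketch: 0 means "none"). */
static int toy_can_busy_poll(const struct toy_sock *sk)
{
    assert(sk != NULL);
    return sk->napi_id != 0;
}
```

In this model a socket that has never received a tagged packet keeps an
id of 0, so toy_can_busy_poll() stays false until both steps have run at
least once.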

I do verify the socket napi_id once a WebSocket session is established.
At that point a lot of packets going back and forth have been
exchanged:

TCP handshake
TLS handshake
HTTP request requesting a WS upgrade

At that point, the napi_id has been assigned.

My only problem was that my socket's packets were routed through the
loopback interface, which has no napi devices associated with it.

I removed the local SOCKS proxy from my setup and NAPI ids started
to appear as expected.
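
For anyone wanting to do the same check from userspace, one way is the
SO_INCOMING_NAPI_ID socket option. The sketch below is not taken from my
test program; get_incoming_napi_id() is a hypothetical helper name, and it
assumes Linux headers that define the option (otherwise it just reports
ENOPROTOOPT).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>

/* Hypothetical helper: returns 0 and stores the id in *napi_id on
 * success, -1 on failure with errno set. *napi_id is reset to 0 first,
 * so a failed call leaves it at 0. An id of 0 means the socket has no
 * NAPI context associated with it (yet). */
static int get_incoming_napi_id(int fd, unsigned int *napi_id)
{
    socklen_t len = sizeof(*napi_id);

    assert(napi_id != NULL);
    *napi_id = 0;
#ifdef SO_INCOMING_NAPI_ID
    return getsockopt(fd, SOL_SOCKET, SO_INCOMING_NAPI_ID, napi_id, &len);
#else
    (void)fd;
    (void)len;
    errno = ENOPROTOOPT;    /* option not known to these headers */
    return -1;
#endif
}
```

Calling it right after socket() and again once the session is established
should show the id going from 0 to a real value, unless the traffic only
goes through an interface without napi devices, such as loopback.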

From my understanding of the network stack, the napi_id of a socket
won't be valid until it gets some packets, because before that moment
busy_poll doesn't know which hw queue to poll.

In other words, the idea of NAPI polling is: the packets of a socket
can come from any hw queue of a net adapter, but we only poll the
queue which just received some data. So to get this piece of info,
some data must arrive on one queue before doing the busy_poll.
Correct me if I'm wrong, since I'm also a newbie to network stuff...

I am now getting what you mean here. So there are 2 possible
approaches. You either:

1. add the napi id once you are sure that it is available, i.e. after
it has been set in the sock layer. At that point it is too late to be
of any use for the current request (unless it is a MULTISHOT poll), and
you cannot be sure that it will be needed again by future requests.
(The add is performed in io_poll_task_func() and io_apoll_task_func().)

2. add the napi id when the request's poll is armed, where this
knowledge could be leveraged to handle the current req, knowing that we
might fail to get the id if it is the initial recv request. (The add
would be performed in __io_arm_poll_handler().)
I explain this in the patch.
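
The difference between the two call sites can be illustrated with a toy
model. This is only a sketch and not the code from the patch:
toy_napi_list and toy_napi_list_add() are hypothetical names, the list
is a fixed-size array for simplicity, and 0 is assumed to mean that the
socket has no id yet.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_IDS 16

/* Hypothetical stand-in for the per-ring list of NAPI ids to busy poll. */
struct toy_napi_list {
    unsigned int ids[TOY_MAX_IDS];
    size_t count;
};

/* Track id if it is assigned (non-zero) and not tracked already.
 * Returns 1 if the id is tracked after the call, 0 otherwise. */
static int toy_napi_list_add(struct toy_napi_list *l, unsigned int id)
{
    assert(l != NULL);
    if (id == 0)
        return 0;                   /* no packet received yet: nothing to add */
    for (size_t i = 0; i < l->count; i++)
        if (l->ids[i] == id)
            return 1;               /* already tracked, no duplicate entry */
    if (l->count == TOY_MAX_IDS)
        return 0;                   /* toy list is full */
    l->ids[l->count++] = id;
    return 1;
}
```

With option 2 the add runs at arm time, so for the initial recv request
it sees an id of 0 and returns 0. With option 1 the add runs after data
has arrived, so the id is set, but the request that triggered it has
already completed.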

TBH, I am not sure if there are arguments in favor of one approach over
the other... Maybe option #1 is the only one to make napi busy poll
work correctly with MULTISHOT requests...

I'll let you think about this point... Your first choice might be the
right one...

The other thing to consider when choosing the call site is locking:
when the add is done from __io_arm_poll_handler(), uring_lock is held...

I am not sure that this is always the case with
io_poll_task_func/io_apoll_task_func...

I'll post v1 of the patch. My testing is showing that it works fine.
A race condition is not an issue when busy poll is performed by the
sqpoll thread, because the list modification is exclusively performed
by that thread too.

But I think that there is a possible race condition where the napi_list
could be used from io_cqring_wait() while another thread modifies the
list. This is NOT done in my testing scenario, but it is definitely
something that could happen somewhere in the real world...

Will there be any issue if we do the access with
list_for_each_entry_safe? I think it is safe enough.
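
For reference, what the _safe iteration pattern provides is a lookahead
pointer saved before the loop body runs, so the entry being visited can
be removed during the walk. By itself that does not order the walk
against a different thread changing the list at the same time. The
sketch below mimics the pattern in userspace on a singly linked list;
the toy_* names are hypothetical and this is not the kernel macro.

```c
#include <assert.h>
#include <stdlib.h>

struct toy_node {
    unsigned int napi_id;
    struct toy_node *next;
};

/* Push a new node at the head; returns the new head, or the old head
 * unchanged if the allocation fails. */
static struct toy_node *toy_push(struct toy_node *head, unsigned int id)
{
    struct toy_node *n = malloc(sizeof(*n));

    if (n == NULL)
        return head;
    n->napi_id = id;
    n->next = head;
    return n;
}

/* Remove every node whose napi_id equals id, walking "safe" style:
 * the next pointer is saved before the current node may be freed.
 * Returns the number of nodes removed. */
static size_t toy_remove_id(struct toy_node **head, unsigned int id)
{
    struct toy_node **link = head;  /* pointer that points at pos */
    struct toy_node *pos, *tmp;
    size_t removed = 0;

    assert(head != NULL);
    for (pos = *head; pos != NULL; pos = tmp) {
        tmp = pos->next;            /* lookahead, taken before any free */
        if (pos->napi_id == id) {
            *link = tmp;            /* unlink pos */
            free(pos);
            removed++;
        } else {
            link = &pos->next;
        }
    }
    return removed;
}
```

Here the walk and the removal happen in the same thread. If another
thread could add or remove nodes in the middle of the loop, the saved
tmp pointer could itself become stale, which is the case the locking
question is about.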




I was considering polling all the rx rings, but that seemed
inefficient according to some tests by my colleague.

This is definitely the simplest implementation, but I did not go as far
as testing it. There are too many unknown variables for it to be viable
IMHO. I am not too sure how many napi devices there can be in a typical
server. I know that my test machine has 2 NICs and one of them is just
unconnected. If we were to loop through all the devices, we would be
wastefully polling at least half of all the devices on the system. That
does not sound like a very good approach.




