Re: [PATCH 0/2] abstract napi tracking strategy

On 8/13/24 3:25 PM, Olivier Langlois wrote:
> On Tue, 2024-08-13 at 12:33 -0600, Jens Axboe wrote:
>> On 8/13/24 10:44 AM, Olivier Langlois wrote:
>>> The current napi tracking strategy induces a non-negligible overhead.
>>> Every time a multishot poll is triggered or any poll is armed, if napi
>>> is enabled on the ring, a lookup is performed to either add a new napi
>>> id to the napi_list or update its timeout value.
>>>
>>> For many scenarios, this is overkill as the napi id list will be pretty
>>> much static most of the time. To address this common scenario, a new
>>> abstraction has been created following the common Linux kernel idiom of
>>> creating an abstract interface with a struct filled with function
>>> pointers.
>>>
>>> Creating an alternate napi tracking strategy is therefore done in 2
>>> phases.
>>>
>>> 1. Introduce the io_napi_tracking_ops interface
>>> 2. Implement a static napi tracking by defining a new
>>>    io_napi_tracking_ops
>>
>> I don't think we should create ops for this, unless there's a strict
>> need to do so. Indirect function calls aren't cheap, and the CPU side
>> mitigations for security issues made them worse.
>>
>> You're not wrong that ops is not an uncommon idiom in the kernel, but
>> it's a lot less prevalent as a solution than it used to be, exactly
>> because of the above reasons.
>>
> ok. Do you have a reference explaining this?
> and what type of construct would you use instead?

See all the Spectre nonsense, and the mitigations that followed from
that.

> AFAIK, a big performance killer is the branch mispredictions coming
> from big switch/case or if/else if/else blocks and it was precisely the
> reason why you replaced the big switch/case io_uring used to have with
> function pointers in io_issue_def...

For sure, which is why io_uring itself ended up using indirect function
calls, because the table just became unwieldy. But that's a different
case from adding it for just a single case, or two. For those, branch
prediction should be fine, as it would always have the same outcome.
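
To make the trade-off concrete, here is a minimal userspace sketch. It is
not io_uring code: every name in it (struct ring, enum track_mode,
struct track_ops, track_dynamic, track_static) is made up for illustration.
It contrasts dispatching through an ops struct with testing a mode field
that never changes for a given ring, which is the easy case for the branch
predictor.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout; none of this is io_uring code. */
enum track_mode { TRACK_DYNAMIC, TRACK_STATIC };

struct ring;

/* Variant A: an ops table, one indirect call per tracking event. */
struct track_ops {
	void (*add_id)(struct ring *r, unsigned int id);
};

struct ring {
	enum track_mode mode;         /* fixed once the ring is set up */
	const struct track_ops *ops;  /* only used by variant A */
	unsigned int lookups;         /* counts simulated list lookups */
};

/* Dynamic tracking: stands in for the per-poll list lookup. */
static void track_dynamic(struct ring *r, unsigned int id)
{
	(void)id;
	r->lookups++;
}

/* Static tracking: the list is managed elsewhere, nothing to do here. */
static void track_static(struct ring *r, unsigned int id)
{
	(void)r;
	(void)id;
}

static const struct track_ops dynamic_ops = { .add_id = track_dynamic };
static const struct track_ops static_ops = { .add_id = track_static };

/* Variant A: indirect call through the ops pointer. */
static void ring_track_ops(struct ring *r, unsigned int id)
{
	assert(r != NULL && r->ops != NULL);
	r->ops->add_id(r, id);
}

/* Variant B: a branch on the mode, then direct calls. */
static void ring_track_branch(struct ring *r, unsigned int id)
{
	assert(r != NULL);
	if (r->mode == TRACK_STATIC)
		track_static(r, id);
	else
		track_dynamic(r, id);
}
```

Both variants do the same work; they differ only in how the call is
dispatched. With only two outcomes and a mode that stays constant per ring,
variant B gives the CPU a branch that always goes the same way, whereas
variant A pays for an indirect call on every event.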

> I consume an enormous amount of programming learning material daily
> and this is the first time that I am hearing this.

The kernel and backend programming are a bit different in that regard,
for better or for worse.

> If there was a performance concern about this type of construct and
> considering that my main programming language is C++, I am a bit
> surprised that I have not seen anything about some problems with C++
> vtables...

It's definitely slower than a direct function call, regardless of
whether this is in the kernel or not. Can be mitigated by having the
common case be predicted with a branch. See INDIRECT_CALL_*() in the
kernel.
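
For reference, the idea behind INDIRECT_CALL_*() can be sketched in
userspace as below. This is a simplified stand-in, not the kernel macro
itself: SKETCH_INDIRECT_CALL_1, handler_common, handler_rare and dispatch
are made-up names, and __builtin_expect is a GCC/Clang builtin playing the
role of likely().

```c
#include <assert.h>
#include <stddef.h>

/*
 * If the pointer matches the expected common target, call that target
 * directly (a predictable compare plus a direct call); otherwise fall
 * back to the indirect call.
 */
#define SKETCH_INDIRECT_CALL_1(f, f1, ...) \
	(__builtin_expect((f) == (f1), 1) ? f1(__VA_ARGS__) : (f)(__VA_ARGS__))

typedef int (*handler_fn)(int);

/* Hypothetical handlers: one expected to be hit most of the time. */
static int handler_common(int x) { return x + 1; }
static int handler_rare(int x) { return x * 2; }

/* Dispatch with the common target guessed up front. */
static int dispatch(handler_fn fn, int arg)
{
	assert(fn != NULL);
	return SKETCH_INDIRECT_CALL_1(fn, handler_common, arg);
}
```

Calling dispatch(handler_common, 41) takes the direct-call path, while
dispatch(handler_rare, 21) falls through to the indirect call; the result
is correct either way, only the cost differs.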

> but oh well, I am learning new stuff everyday, so please share the
> references you have about the topic so that I can perfect my knowledge.

I think LWN had a recent thing on indirect function calls as it pertains
to the security modules, I'd check that first. But the Spectre thing
above is likely all you need!

-- 
Jens Axboe




