Re: [PATCH 0/2] abstract napi tracking strategy

On 8/13/24 22:25, Olivier Langlois wrote:
On Tue, 2024-08-13 at 12:33 -0600, Jens Axboe wrote:
On 8/13/24 10:44 AM, Olivier Langlois wrote:
The current napi tracking strategy induces a non-negligible
overhead. Every time a multishot poll is triggered or any poll is
armed, if napi is enabled on the ring, a lookup is performed to
either add a new napi id to the napi_list or update its timeout
value.

For many scenarios, this is overkill as the napi id list will be
pretty much static most of the time. To address this common scenario,
a new abstraction has been created following the common Linux kernel
idiom of creating an abstract interface with a struct filled with
function pointers.

Creating an alternate napi tracking strategy is therefore done in 2
phases:

1. Introduce the io_napi_tracking_ops interface
2. Implement static napi tracking by defining a new
   io_napi_tracking_ops
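
As a rough illustration of the "struct filled with function pointers"
idiom described above, here is a minimal userspace sketch. Every name
in it (tracker, tracking_ops, add_id, and so on) is hypothetical and
chosen for the example; it is not the interface from the patch series
and does not reflect the real io_uring napi code.

```c
#include <stddef.h>

/* All names below are hypothetical; this is not the io_uring code. */
struct tracker;

/* The "ops" interface: one function pointer per overridable step. */
struct tracking_ops {
	/* Called whenever a poll is armed on a socket with this napi id. */
	int (*add_id)(struct tracker *t, unsigned int napi_id);
};

struct tracker {
	const struct tracking_ops *ops;
	unsigned int ids[8];
	size_t nr_ids;
	unsigned long lookups;	/* counts list searches, for illustration */
};

/* Dynamic strategy: search the list on every call, append if missing. */
static int dynamic_add_id(struct tracker *t, unsigned int napi_id)
{
	t->lookups++;
	for (size_t i = 0; i < t->nr_ids; i++)
		if (t->ids[i] == napi_id)
			return 0;	/* already tracked */
	if (t->nr_ids == sizeof(t->ids) / sizeof(t->ids[0]))
		return -1;	/* list is full */
	t->ids[t->nr_ids++] = napi_id;
	return 1;		/* newly added */
}

/* Static strategy: the list is set up ahead of time, so this is a no-op. */
static int static_add_id(struct tracker *t, unsigned int napi_id)
{
	(void)t;
	(void)napi_id;
	return 0;
}

static const struct tracking_ops dynamic_ops = { .add_id = dynamic_add_id };
static const struct tracking_ops static_ops = { .add_id = static_add_id };

/* Hot-path entry point: a single indirect call through the ops table. */
static int tracker_add_id(struct tracker *t, unsigned int napi_id)
{
	return t->ops->add_id(t, napi_id);
}
```

A caller would pick a strategy once, for example
struct tracker t = { .ops = &static_ops };, and then call
tracker_add_id() from the poll path. The cost being debated in this
thread is the indirect call inside tracker_add_id().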

I don't think we should create ops for this, unless there's a strict
need to do so. Indirect function calls aren't cheap, and the CPU side
mitigations for security issues made them worse.

You're not wrong that ops is not an uncommon idiom in the kernel, but
it's a lot less prevalent as a solution than it used to be, exactly
because of the above reasons.

OK. Do you have a reference explaining this?
And what type of construct would you use instead?

AFAIK, a big performance killer is branch mispredictions coming
from big switch/case or if/else if/else blocks, and that was precisely
the reason why you replaced the big switch/case io_uring had with
function pointers in io_issue_def...

Compilers can optimise switch-case very well; look up what jump
tables are. They often work even better than indirect function calls,
even without mitigations. And it wasn't converted because of
performance; it was a nice, efficient jump table before.

And it's not as if compilers can devirtualise indirect calls either;
I'd say they hit the pipeline even harder. Maybe not as hard as a long
if-else-if chain in the final binary, but jump tables help, and we're
talking about a single "if".
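
To make the two dispatch styles compared above concrete, here is a
small sketch. The opcode names and handlers are made up for the
example and have nothing to do with io_issue_def. Whether the switch
actually becomes a jump table, a branch chain, or inlined code depends
on the compiler, the target, and the options used, so treat the
comments as what a compiler may do rather than what it will do.

```c
/* Hypothetical opcodes with dense, consecutive values. */
enum op { OP_NOP, OP_INC, OP_DEC, OP_DOUBLE, OP_NR };

/*
 * Switch-based dispatch. With dense case values a compiler may lower
 * this to a jump table, and the case bodies stay visible to the
 * optimiser in one function.
 */
static int dispatch_switch(enum op op, int v)
{
	switch (op) {
	case OP_NOP:
		return v;
	case OP_INC:
		return v + 1;
	case OP_DEC:
		return v - 1;
	case OP_DOUBLE:
		return v * 2;
	default:
		return v;
	}
}

static int do_nop(int v) { return v; }
static int do_inc(int v) { return v + 1; }
static int do_dec(int v) { return v - 1; }
static int do_double(int v) { return v * 2; }

/* Table of function pointers, indexed by opcode. */
static int (*const handlers[OP_NR])(int) = {
	[OP_NOP] = do_nop,
	[OP_INC] = do_inc,
	[OP_DEC] = do_dec,
	[OP_DOUBLE] = do_double,
};

/*
 * Indirect dispatch. When op is only known at run time, this is an
 * indirect call, which is the kind of call that retpoline-style
 * mitigations make more expensive.
 */
static int dispatch_indirect(enum op op, int v)
{
	if ((unsigned int)op >= OP_NR)
		return v;	/* unknown opcode: leave the value alone */
	return handlers[op](v);
}
```

Both functions return the same result for every opcode; only the
dispatch mechanism differs.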

I totally agree, it's way over-engineered.

I consume an enormous amount of programming learning material daily
and this is the first time that I am hearing this.

If there was a performance concern about this type of construct, and
considering that my main programming language is C++, I am a bit
surprised that I have not seen anything about problems with C++
vtbls...

Even leaving the mitigation business aside, there is a lot to look up
about devirtualisation, which is also why the "final" keyword exists
in C++.

--
Pavel Begunkov



