On 8/13/24 3:25 PM, Olivier Langlois wrote:
> On Tue, 2024-08-13 at 12:33 -0600, Jens Axboe wrote:
>> On 8/13/24 10:44 AM, Olivier Langlois wrote:
>>> the actual napi tracking strategy is inducing a non-negligible
>>> overhead.
>>> Every time a multishot poll is triggered or any poll armed, if the
>>> napi is enabled on the ring a lookup is performed to either add a
>>> new napi id into the napi_list or its timeout value is updated.
>>>
>>> For many scenarios, this is overkill as the napi id list will be
>>> pretty much static most of the time. To address this common
>>> scenario, a new abstraction has been created following the common
>>> Linux kernel idiom of creating an abstract interface with a struct
>>> filled with function pointers.
>>>
>>> Creating an alternate napi tracking strategy is therefore made in 2
>>> phases.
>>>
>>> 1. Introduce the io_napi_tracking_ops interface
>>> 2. Implement a static napi tracking by defining a new
>>>    io_napi_tracking_ops
>>
>> I don't think we should create ops for this, unless there's a strict
>> need to do so. Indirect function calls aren't cheap, and the CPU side
>> mitigations for security issues made them worse.
>>
>> You're not wrong that ops is not an uncommon idiom in the kernel, but
>> it's a lot less prevalent as a solution than it used to be. Exactly
>> because of the above reasons.
>>
> ok. Do you have a reference explaining this?
> and what type of construct would you use instead?

See all the spectre nonsense, and the mitigations that followed from
that.

> AFAIK, a big performance killer is the branch mispredictions coming
> from big switch/case or if/else if/else blocks and it was precisely the
> reason why you removed the big switch/case io_uring was having with
> function pointers in io_issue_def...

For sure, which is why io_uring itself ended up using indirect function
calls, because the table just became unwieldy.
But that's a different case from adding it for just a single case, or
two. For those, branch prediction should be fine, as it would always
have the same outcome.

> I consume an enormous amount of programming learning material daily
> and this is the first time that I am hearing this.

The kernel and backend programming are a bit different in that regard,
for better or for worse.

> If there was a performance concern about this type of construct and
> considering that my main programming language is C++, I am a bit
> surprised that I have not seen anything about some problems with C++
> vtbls...

It's definitely slower than a direct function call, regardless of
whether this is in the kernel or not. Can be mitigated by having the
common case be predicted with a branch. See INDIRECT_CALL_*() in the
kernel.

> but oh well, I am learning new stuff every day, so please share the
> references you have about the topic so that I can perfect my knowledge.

I think lwn had a recent thing on indirect function calls as it pertains
to the security modules, I'd check that first. But the spectre thing
above is likely all you need!

--
Jens Axboe
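
For reference, a minimal userspace sketch of "having the common case be
predicted with a branch", the idea behind the INDIRECT_CALL_*() wrappers
as I understand them: compare the function pointer against the expected
target and call that target directly on a match, falling back to the
indirect call otherwise. Everything named here (struct tracker,
track_dynamic, track_static, CALL_LIKELY_1, tracker_add) is hypothetical
and made up for illustration. It is not the kernel macro and not code
from this patch series, and __builtin_expect assumes GCC or Clang.

```c
#include <stddef.h>

/* Hypothetical types and names, for illustration only. */
struct tracker;
typedef int (*track_fn)(struct tracker *t, unsigned int id);

struct tracker {
	track_fn track;		/* strategy hook, normally an indirect call */
	unsigned int last_id;
	unsigned int calls;
};

/* Assumed to be the common strategy: record the id. */
static int track_dynamic(struct tracker *t, unsigned int id)
{
	t->last_id = id;
	t->calls++;
	return 1;
}

/* Assumed to be the rare strategy: nothing to record. */
static int track_static(struct tracker *t, unsigned int id)
{
	(void)t;
	(void)id;
	return 0;
}

/*
 * If the pointer matches the expected target, call that target directly
 * (an ordinary, predictable branch plus a direct call). Otherwise fall
 * back to the indirect call through the pointer.
 */
#define CALL_LIKELY_1(f, f1, ...) \
	(__builtin_expect((f) == (f1), 1) ? f1(__VA_ARGS__) : (f)(__VA_ARGS__))

static int tracker_add(struct tracker *t, unsigned int id)
{
	return CALL_LIKELY_1(t->track, track_dynamic, t, id);
}
```

A tracker set up with track_dynamic takes the direct-call path in
tracker_add(); one set up with track_static still works but goes
through the pointer. With only one or two strategies, a plain if/else
on a flag gets the same effect without any function pointer at all.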