On 4/30/24 12:29 PM, Mina Almasry wrote: > On Tue, Apr 30, 2024 at 6:46?AM Jens Axboe <axboe@xxxxxxxxx> wrote: >> >> On 4/26/24 8:11 PM, Mina Almasry wrote: >>> On Fri, Apr 26, 2024 at 5:18?PM David Wei <dw@xxxxxxxxxxx> wrote: >>>> >>>> On 2024-04-02 5:20 pm, Mina Almasry wrote: >>>>> @@ -69,20 +106,26 @@ net_iov_binding(const struct net_iov *niov) >>>>> */ >>>>> typedef unsigned long __bitwise netmem_ref; >>>>> >>>>> +static inline bool netmem_is_net_iov(const netmem_ref netmem) >>>>> +{ >>>>> +#if defined(CONFIG_PAGE_POOL) && defined(CONFIG_DMA_SHARED_BUFFER) >>>> >>>> I am guessing you added this to try and speed up the fast path? It's >>>> overly restrictive for us since we do not need dmabuf necessarily. I >>>> spent a bit too much time wondering why things aren't working only to >>>> find this :( >>> >>> My apologies, I'll try to put the changelog somewhere prominent, or >>> notify you when I do something that I think breaks you. >>> >>> Yes, this is a by-product of a discussion with regards to the >>> page_pool benchmark regressions due to adding devmem. There is some >>> background on why this was added and the impact on the >>> bench_page_pool_simple tests in the cover letter. >>> >>> For you, I imagine you want to change this to something like: >>> >>> #if defined(CONFIG_PAGE_POOL) >>> #if defined(CONFIG_DMA_SHARED_BUFFER) || defined(CONFIG_IOURING) >>> >>> or something like that, right? Not sure if this is something I should >>> do here or if something more appropriate to be in the patches you >>> apply on top. >> >> In general, attempting to hide overhead behind config options is always >> a losing proposition. It merely serves to say "look, if these things >> aren't enabled, the overhead isn't there", while distros blindly enable >> pretty much everything and then you're back where you started. >> > > The history there is that this check adds 1 cycle regression to the > page_pool fast path benchmark. The regression last I measured is 8->9 > cycles, so in % wise it's a quite significant 12.5% (more details in > the cover letter[1]). I doubt I can do much better than that to be > honest. I'm all for cycle counting, and do it myself too, but is that even measurable in anything that isn't a super targeted microbenchmark? Or even in that? > There was a desire not to pay this overhead in setups that will likely > not care about devmem, like embedded devices maybe, or setups without > GPUs. Adding a CONFIG check here seemed like very low hanging fruit, > but yes it just hides the overhead in some configs, not really removes > it. > > There was a discussion about adding this entire netmem/devmem work > under a new CONFIG. There was pushback particularly from Willem that > at the end of the day what is enabled on most distros is what matters > and we added code churn and CONFIG churn for little value. > > If there is significant pushback to the CONFIG check I can remove it. > I don't feel like it's critical, it just mirco-optimizes some setups > that doesn't really care about this work area. That is true, but in practice it'll be enabled anyway. Seems like it's not really worth it in this scenario. -- Jens Axboe