On Tue, Apr 30, 2024 at 11:55 AM Jens Axboe <axboe@xxxxxxxxx> wrote:
>
> On 4/30/24 12:29 PM, Mina Almasry wrote:
> > On Tue, Apr 30, 2024 at 6:46 AM Jens Axboe <axboe@xxxxxxxxx> wrote:
> >>
> >> On 4/26/24 8:11 PM, Mina Almasry wrote:
> >>> On Fri, Apr 26, 2024 at 5:18 PM David Wei <dw@xxxxxxxxxxx> wrote:
> >>>>
> >>>> On 2024-04-02 5:20 pm, Mina Almasry wrote:
> >>>>> @@ -69,20 +106,26 @@ net_iov_binding(const struct net_iov *niov)
> >>>>>   */
> >>>>>  typedef unsigned long __bitwise netmem_ref;
> >>>>>
> >>>>> +static inline bool netmem_is_net_iov(const netmem_ref netmem)
> >>>>> +{
> >>>>> +#if defined(CONFIG_PAGE_POOL) && defined(CONFIG_DMA_SHARED_BUFFER)
> >>>>
> >>>> I am guessing you added this to try and speed up the fast path? It's
> >>>> overly restrictive for us since we do not necessarily need dmabuf. I
> >>>> spent a bit too much time wondering why things weren't working, only
> >>>> to find this :(
> >>>
> >>> My apologies, I'll try to put the changelog somewhere prominent, or
> >>> notify you when I do something that I think breaks you.
> >>>
> >>> Yes, this is a by-product of a discussion regarding the page_pool
> >>> benchmark regressions due to adding devmem. There is some background
> >>> on why this was added and the impact on the bench_page_pool_simple
> >>> tests in the cover letter.
> >>>
> >>> For you, I imagine you want to change this to something like:
> >>>
> >>> #if defined(CONFIG_PAGE_POOL)
> >>> #if defined(CONFIG_DMA_SHARED_BUFFER) || defined(CONFIG_IO_URING)
> >>>
> >>> or something like that, right? Not sure if this is something I should
> >>> do here or if it's more appropriate for the patches you apply on top.
> >>
> >> In general, attempting to hide overhead behind config options is always
> >> a losing proposition. It merely serves to say "look, if these things
> >> aren't enabled, the overhead isn't there", while distros blindly enable
> >> pretty much everything and then you're back where you started.
> >>
> >
> > The history there is that this check adds a 1-cycle regression to the
> > page_pool fast path benchmark. The regression last I measured is 8->9
> > cycles, so percentage-wise it's quite significant at 12.5% (more
> > details in the cover letter[1]). I doubt I can do much better than
> > that, to be honest.
>
> I'm all for cycle counting, and do it myself too, but is that even
> measurable in anything that isn't a super targeted microbenchmark? Or
> even in that?
>

Not as far as I can tell, no. This was purely to improve the page_pool
benchmark.

> > There was a desire not to pay this overhead in setups that will likely
> > not care about devmem, like embedded devices maybe, or setups without
> > GPUs. Adding a CONFIG check here seemed like very low-hanging fruit,
> > but yes, it just hides the overhead in some configs rather than
> > removing it.
> >
> > There was a discussion about putting this entire netmem/devmem work
> > under a new CONFIG. There was pushback, particularly from Willem, that
> > at the end of the day what is enabled on most distros is what matters,
> > and that we'd be adding code churn and CONFIG churn for little value.
> >
> > If there is significant pushback to the CONFIG check I can remove it.
> > I don't feel it's critical; it just micro-optimizes some setups that
> > don't really care about this work area.
>
> That is true, but in practice it'll be enabled anyway. Seems like it's
> not really worth it in this scenario.
>

OK, no pushback from me. I'll remove the CONFIG check in the next
iteration.

--
Thanks,
Mina
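
For reference, with the guard removed, netmem_is_net_iov() becomes a
single unconditional bit test in every build. A minimal sketch of what
that could look like, assuming the series tags net_iov-backed
netmem_refs by setting the low pointer bit (the function body is not
shown in the hunk above, and NET_IOV is a placeholder name for that
bit, not taken from the patch):

#include <linux/types.h>

/* Placeholder: low-bit tag assumed to mark a netmem_ref that wraps a
 * struct net_iov rather than a struct page.
 */
#define NET_IOV		0x01UL

typedef unsigned long __bitwise netmem_ref;

static inline bool netmem_is_net_iov(const netmem_ref netmem)
{
	/* Always compiled in: one AND plus a compare, which is the
	 * ~1-cycle cost on the page_pool fast path discussed above.
	 */
	return (__force unsigned long)netmem & NET_IOV;
}

A low-bit tag like this works because struct page pointers (and, in
this scheme, struct net_iov pointers) are at least word-aligned, so
bit 0 of a genuine pointer is always zero and is free to carry the
type information.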