On Wed, Jul 10, 2024 at 12:55 PM Jakub Kicinski <kuba@xxxxxxxxxx> wrote:
>
> On Wed, 10 Jul 2024 12:29:58 -0700 Mina Almasry wrote:
> > On Wed, Jul 10, 2024 at 9:37 AM Jakub Kicinski <kuba@xxxxxxxxxx> wrote:
> > > On Wed, 10 Jul 2024 00:17:37 +0000 Mina Almasry wrote:
> > > > + net_devmem_dmabuf_binding_get(binding);
> > >
> > > Why does every iov need to hold a ref? pp holds a ref and does its own
> > > accounting, so it won't disappear unless all the pages are returned.
> > I guess it doesn't really need to, but this is the design/approach I
> > went with, and I actually prefer it a bit. The design is borrowed from
> > how struct dev_pagemap does this, IIRC. Every page allocated from the
> > pgmap holds a reference to the pgmap to ensure the pgmap doesn't go
> > away while some page that originated from it is out in the wild, and
> > similarly I did so in the binding here.
>
> Oh, you napi_pp_put_page() on the other end! I can see how that could
> be fine.
>
> > We could assume that the page_pool is accounting iovs for us, but that
> > is not always true, right? page_pool_return_page() disconnects a
> > netmem from the page_pool and AFAIU the page_pool can go away while
> > there is such a netmem still in use in the net stack. Currently this
> > can't happen with iovs because I currently don't support non-pp
> > refcounting for iovs (so they're always recyclable), but you have a
> > comment on the other patch asking why that works; depending on how we
> > converge on that conversation, the details of how the pp refcounting
> > could change.
>
> Even then - we could take the ref as the page "leaks" out of the pool,
> rather than doing it on the fast path, right? Or just BUG_ON() 'cause
> that reference ain't coming back ;)
>

OK, I'll see how the conversation on the other thread converges
vis-a-vis net_iov refcounting, and then look at whether I can avoid the
binding_get/put per page in that framework.

> > It's nice to know that the binding refcounting will work regardless of
> > the details of how the pp refcounting works. IMHO having the binding
> > rely on the pp refcounting to ensure all the iovs are freed introduces
> > some fragility.
> >
> > Additionally IMO the net_devmem_dmabuf_binding_get/put aren't so
> > expensive to want to optimize out, right? The allocation is a slow
> > path anyway and the fast path recycles netmem.
>
> Yes, I should have read patch 10. I think it's avoidable :) but with
> recycling it can indeed perform just fine (do you happen to have
> recycling rate stats from prod runs?)

I don't, to be honest, for a couple of reasons. One is that for the
gcloud VMs where we mainly use this, these stats are private to the VM
and are not something I can query widely; I only get access to the data
when it is shared with bug reports on specific issues. And in our
internal test runs I do not monitor the recycling rate, as that is fine
as long as recycling is fast enough to find available memory for
incoming data.

What I do look at very closely is the allocation failure rate, i.e.
when GVE tries to allocate a new devmem buffer but is out of devmem
(which would likely be due to recycling not happening fast enough). The
stat is `page_alloc_fail` in ethtool -S for us, and it's one of the
first things I check when things go wrong. It hasn't been the root
cause of any of our issues in reality.

--
Thanks,
Mina
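
P.S. For anyone skimming the thread, here is a minimal userspace sketch
of the refcounting pattern described above. The names (binding, net_iov)
echo the series, but the helpers are simplified stand-ins and not the
kernel code: every allocated iov pins the binding it came from, and the
binding is only torn down once the last outstanding iov has been
returned, mirroring how pages pin their struct dev_pagemap.

/*
 * Illustrative userspace model of the refcounting pattern discussed
 * above -- simplified stand-ins, not the kernel implementation.
 */
#include <stdatomic.h>
#include <stdio.h>
#include <stdlib.h>

struct binding {
	atomic_int refcount;		/* starts at 1 for the creator */
};

struct net_iov {
	struct binding *owner;		/* the binding this iov pins */
};

static struct binding *binding_create(void)
{
	struct binding *b = malloc(sizeof(*b));

	atomic_init(&b->refcount, 1);
	return b;
}

static void binding_put(struct binding *b)
{
	/* Free the binding only when the last reference goes away. */
	if (atomic_fetch_sub(&b->refcount, 1) == 1) {
		printf("last ref dropped, binding freed\n");
		free(b);
	}
}

/*
 * Allocation path: each iov takes a reference so the binding cannot
 * disappear while the iov is out in the wild (the dev_pagemap analogy).
 */
static struct net_iov *net_iov_alloc(struct binding *b)
{
	struct net_iov *iov = malloc(sizeof(*iov));

	atomic_fetch_add(&b->refcount, 1);
	iov->owner = b;
	return iov;
}

/*
 * Free path (the napi_pp_put_page() analogue here): releasing the iov
 * also drops its reference on the binding.
 */
static void net_iov_free(struct net_iov *iov)
{
	binding_put(iov->owner);
	free(iov);
}

int main(void)
{
	struct binding *b = binding_create();
	struct net_iov *iov = net_iov_alloc(b);

	binding_put(b);		/* creator unbinds; iov is still in flight */
	net_iov_free(iov);	/* last reference dropped here */
	return 0;
}

The point of the pattern is that teardown ordering stops mattering: the
creator can drop its own reference whenever it likes, and the binding
only actually goes away once the last in-flight iov has been returned.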