On 7/16/23 8:05 PM, Mina Almasry wrote:
>>
>> For the driver and hardware queue: don't you need a dedicated queue
>> for the flow(s) in question?
>
> In the RFC and the implementation I'm thinking of, the queue is
> 'dedicated' in that each queue will be a devmem TCP queue or a regular
> queue. devmem queues generate devmem skbs and non-devmem queues
> generate non-devmem skbs. We support switching queues between devmem
> mode and non-devmem mode via a uapi.

ethtool APIs or something else?

>
>> If not, how can you properly handle the teardown case (e.g., the app
>> crashes and you need to ensure all references to GPU memory are
>> removed from NIC descriptors)?
>
> Jason and Christian will correct me if I'm wrong, but AFAICT the
> dma-buf API requires the dma-buf provider to keep the attachment
> mapping alive as long as the importer requires it. The dma-buf API
> gives the importer dma_buf_map_attachment() and
> dma_buf_unmap_attachment() APIs, but there is no callback for the
> exporter to inform the importer that it has to take the mapping away.

Isn't the importer the application that terminated (cleanly or
otherwise)? That was my thinking, but I guess there are other designs
that can span more than a single application.

> The closest thing I saw was the move_notify() callback, but that is
> optional.
>
> In my mind the way it works is that there will be some uapi that binds
> a dma-buf to an RX queue; that will create the attachment and the
> mapping. If the user crashes or closes the dma-buf handle then that
> will unbind the dma-buf from the RX queue, but the mapping will remain
> alive (via some refcounting) until all the NIC descriptors are freed
> and the mapping is no longer in use. Usually this will happen at the
> next driver reset, which destroys and recreates the rx queues, thereby
> freeing all the NIC descriptors (but it could be a new API so that we
> don't rely on a driver reset).
>
>> If you agree on this point, then you can require the dedicated queue
>> management in the driver to use and expect only the alternative frag
>> addressing scheme. ie., it knows the address is not a struct page
>> (validated by checking an skb flag, frag flag or address magic), but
>> a reference to, say, a page_pool entry (if you are using page_pool
>> for management of the dmabuf slices) which contains the metadata
>> needed for the use case.
>
> Honestly, if my understanding above doesn't match what you want, I
> could implement 'dedicated queues' instead; just let me know what you
> want at some future iteration. Right now I'm more worried about this
> memory format issue, and I'm working on an RX prototype without struct
> pages. So far, purely technically speaking, it seems possible.
>

My comment was only a suggestion on how to simplify driver changes.
ie., a queue is either pages (based on the standard page_pool or
alloc_pages) or some "special" page_pool (ie., a new abstraction), but
not mixed. In that case it knows how to handle the overloaded 'address'
in skb_frag in a clean manner.
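
For discussion, a rough sketch of the importer-side lifecycle described
above: a uapi binds a dma-buf to an RX queue, the binding takes the
attachment and mapping, and a refcount keeps the mapping alive until the
last in-flight NIC descriptor is reclaimed. Only the dma_buf_*() and
kref_*() calls are the existing kernel APIs; struct devmem_binding and
the helpers around them are hypothetical names for illustration only.

/*
 * Sketch of the "bind dma-buf to RX queue" lifecycle, for discussion.
 * The dma_buf_*() / kref_*() calls are real kernel APIs; the binding
 * structure and helper names are made up.
 */
#include <linux/dma-buf.h>
#include <linux/dma-direction.h>
#include <linux/err.h>
#include <linux/kref.h>
#include <linux/scatterlist.h>
#include <linux/slab.h>

struct devmem_binding {
	struct kref ref;	/* RX queue binding + in-flight descriptors */
	struct dma_buf *dmabuf;
	struct dma_buf_attachment *attach;
	struct sg_table *sgt;
};

/* uapi "bind dma-buf to RX queue" path: attach and map the dma-buf. */
static struct devmem_binding *devmem_bind(struct device *dev, int dmabuf_fd)
{
	struct devmem_binding *b;
	int err;

	b = kzalloc(sizeof(*b), GFP_KERNEL);
	if (!b)
		return ERR_PTR(-ENOMEM);

	b->dmabuf = dma_buf_get(dmabuf_fd);
	if (IS_ERR(b->dmabuf)) {
		err = PTR_ERR(b->dmabuf);
		goto err_free;
	}

	b->attach = dma_buf_attach(b->dmabuf, dev);
	if (IS_ERR(b->attach)) {
		err = PTR_ERR(b->attach);
		goto err_put;
	}

	b->sgt = dma_buf_map_attachment(b->attach, DMA_FROM_DEVICE);
	if (IS_ERR(b->sgt)) {
		err = PTR_ERR(b->sgt);
		goto err_detach;
	}

	kref_init(&b->ref);	/* reference held by the RX queue binding */
	return b;

err_detach:
	dma_buf_detach(b->dmabuf, b->attach);
err_put:
	dma_buf_put(b->dmabuf);
err_free:
	kfree(b);
	return ERR_PTR(err);
}

/* Last reference dropped: no NIC descriptor uses the mapping anymore. */
static void devmem_binding_release(struct kref *ref)
{
	struct devmem_binding *b = container_of(ref, struct devmem_binding, ref);

	dma_buf_unmap_attachment(b->attach, b->sgt, DMA_FROM_DEVICE);
	dma_buf_detach(b->dmabuf, b->attach);
	dma_buf_put(b->dmabuf);
	kfree(b);
}

/*
 * Called on unbind/close (including an app crash) and again as each
 * posted RX descriptor is reclaimed (e.g. on driver reset); the mapping
 * outlives the application until the refcount hits zero.
 */
static void devmem_binding_put(struct devmem_binding *b)
{
	kref_put(&b->ref, devmem_binding_release);
}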
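
On the last point about the overloaded 'address' in skb_frag, here is a
purely illustrative sketch of the "address magic" idea: tag non-page
entries with the pointer's low bit so a queue's handling code can tell a
real struct page from an entry of the "special" page_pool. The
devmem_frag_entry type, the magic value, and the helpers are not an
existing kernel interface, and the sketch assumes the current
bio_vec-based skb_frag_t layout.

/*
 * Illustrative only: a low-bit "magic" on the frag address to
 * distinguish a struct page from a devmem pool entry.  Assumes the
 * bio_vec-based skb_frag_t; devmem_frag_entry and the magic are
 * hypothetical.
 */
#include <linux/skbuff.h>

#define DEVMEM_FRAG_MAGIC	0x1UL

struct devmem_frag_entry;	/* hypothetical entry describing a dmabuf slice */

static inline bool skb_frag_is_devmem(const skb_frag_t *frag)
{
	return (unsigned long)frag->bv_page & DEVMEM_FRAG_MAGIC;
}

static inline struct devmem_frag_entry *skb_frag_devmem(const skb_frag_t *frag)
{
	return (struct devmem_frag_entry *)
	       ((unsigned long)frag->bv_page & ~DEVMEM_FRAG_MAGIC);
}

/* A non-devmem queue only ever sees real pages, so this stays cheap. */
static inline struct page *skb_frag_page_or_null(const skb_frag_t *frag)
{
	return skb_frag_is_devmem(frag) ? NULL : frag->bv_page;
}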