On 7/11/23 2:39 PM, Jakub Kicinski wrote: > On Tue, 11 Jul 2023 10:06:28 -0700 Mina Almasry wrote: >>>> Any reason not to allow an alternative representation for skb frags than >>>> struct page? >>> >>> I don't think there's a hard technical reason. We can make it work. >> >> I also think we can switch the representation for skb frags to >> something else. However - please do correct me if I'm wrong - I don't >> think that is sufficient for device memory TCP. My understanding is >> that we also need to modify any NIC drivers that want to use device >> memory TCP to understand a new memory type, and the page pool as well >> if that's involved. I think in particular modifying the memory type in >> all the NIC drivers that want to do device memory TCP is difficult. Do >> you think this is feasible? > > That's why I was thinking about adding an abstraction between > the page pool and the driver. Instead of feeding driver pages > a new abstraction could feed the driver just an identifier and a PA. skb frag is currently a bio_vec. Overloading the 'struct page' address in that struct with another address is easy to do. Requiring a certain alignment on the address gives you a few low bits to use a flags / magic / etc. Overloading len and offset is not really possible - way too much code is affected (e.g., iov walking and MSS / TSO segmenting). ie., you could overload page address with a pointer to an object in your new abstraction layer and the struct has the other meta data. typedef struct skb_frag { union { struct bio_vec bvec; struct new_abstraction abs; }; } skb_frag_t; where struct new_abstraction { void *addr, unsigned int len; unsigned int offset; }; I have been playing with a similar and it co-exists with the existing code quite well with the constraint on location of len and offset. > > Whether we want to support fragmentation in that model or not would > have to be decided. > > We can take pages from the page pool and feed them to drivers via > such an API, but drivers need to stop expecting pages. yes, drivers would have to be updated to understand the new format. A downside, but again relatively easy to manage. > > That's for data buffers only, obviously. We can keep using pages > and raw page pool for headers. yes.