On Tue, Apr 12, 2016 at 08:16:49AM +0200, Jesper Dangaard Brouer wrote:
>
> On Mon, 11 Apr 2016 15:21:26 -0700
> Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> wrote:
>
> > On Mon, Apr 11, 2016 at 11:41:57PM +0200, Jesper Dangaard Brouer wrote:
> > >
> > > On Sun, 10 Apr 2016 21:45:47 +0300 Sagi Grimberg <sagi@xxxxxxxxxxx> wrote:
> > > [...]
> > > >
> > > > If we go down this road, how about also attaching some driver opaques
> > > > to the page sets?
> > >
> > > That was the ultimate plan... to leave some opaque bytes in the
> > > page struct that drivers could use.
> > >
> > > In struct page I would need a pointer back to my page_pool struct and a
> > > page flag. Then I would need room to store the dma_unmap address.
> > > (And then some of the usual fields are still needed, like the refcnt,
> > > and reusing some of the list constructs.) And a zero-copy cross-domain
> > > id.
> >
> > I don't think we need to add anything to struct page.
> > This is supposed to be a small cache of dma-mapped pages with lockless
> > access. It can be implemented as an array or linked list where every
> > element is a dma_addr and a pointer to a page. If it is full,
> > dma_unmap_page+put_page sends the page back to the page allocator.
>
> It sounds like the Intel drivers' recycle facility, where they split the
> page into two parts and keep the page in the RX-ring by swapping to the
> other half of the page if page_count(page) is <= 2. Thus, they use the
> atomic page refcount to synchronize on.

Actually, I'm proposing the opposite: one page = one packet. I'm
perfectly happy to waste half a page, since the number of such pages is
small and performance matters more. A typical performance vs. memory
tradeoff.

> Thus, we end up having two atomic operations per RX packet on the page
> refcnt, where DPDK has zero...

The page recycling cache should have zero atomic ops per packet,
otherwise it's a non-starter.

> By fully taking over the page as an allocator, almost like slab, I can
> optimize the common case (of the packet-page getting allocated and
> freed on the same CPU) and remove these atomic operations.

SLUB does a local cmpxchg; 40G networking cannot afford that per packet.
If it's amortized due to batching, that will be ok.
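
To make the array idea concrete, here is a minimal sketch of such a
cache. All names (pp_cache, pp_ent, pp_alloc, pp_recycle, PP_CACHE_SIZE)
are hypothetical, not an existing API, and it assumes alloc and free
both run in the driver's NAPI poll context on the same CPU, so plain
per-CPU data is enough and the fast path touches no atomics:

#include <linux/dma-mapping.h>
#include <linux/gfp.h>
#include <linux/mm.h>
#include <linux/percpu.h>

#define PP_CACHE_SIZE 128	/* small cache; size is a tuning knob */

struct pp_ent {
	struct page *page;	/* one page = one packet */
	dma_addr_t   dma;	/* kept here, so nothing is added to struct page */
};

struct pp_cache {
	unsigned int	count;
	struct pp_ent	ent[PP_CACHE_SIZE];
};

static DEFINE_PER_CPU(struct pp_cache, pp_cache);

/* Return a still-mapped page to the cache. Runs in NAPI (softirq)
 * context on the owning CPU, so no locks or atomics on the fast path;
 * overflow goes back to the page allocator via dma_unmap_page+put_page.
 */
static void pp_recycle(struct device *dev, struct page *page, dma_addr_t dma)
{
	struct pp_cache *c = this_cpu_ptr(&pp_cache);

	if (likely(c->count < PP_CACHE_SIZE)) {
		c->ent[c->count].page = page;
		c->ent[c->count].dma  = dma;
		c->count++;
		return;
	}
	dma_unmap_page(dev, dma, PAGE_SIZE, DMA_FROM_DEVICE);
	put_page(page);
}

/* Grab an already-mapped page for an RX descriptor. Only the refill
 * slow path pays for alloc_page() and dma_map_page().
 */
static struct page *pp_alloc(struct device *dev, dma_addr_t *dma)
{
	struct pp_cache *c = this_cpu_ptr(&pp_cache);
	struct page *page;

	if (likely(c->count)) {
		c->count--;
		*dma = c->ent[c->count].dma;
		return c->ent[c->count].page;
	}

	page = alloc_page(GFP_ATOMIC);
	if (!page)
		return NULL;
	*dma = dma_map_page(dev, page, 0, PAGE_SIZE, DMA_FROM_DEVICE);
	if (dma_mapping_error(dev, *dma)) {
		put_page(page);
		return NULL;
	}
	return page;
}

A real implementation would still need a path for pages freed on a
different CPU than they were allocated on, and for unmapping everything
on device teardown; the sketch only illustrates the zero-atomic-ops
fast path argued for above.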