On Thu, Apr 20, 2023 at 03:59:39PM +0200, Alexander Lobakin wrote:
> Hmm, currently almost all Ethernet drivers map Rx pages once and then
> just recycle them, keeping the original DMA mapping. Which means pages
> can have the same first mapping for very long time, often even for the
> lifetime of the struct device. Same for XDP sockets, the lifetime of DMA
> mappings equals the lifetime of sockets.
> Does it mean we'd better review that approach and try switching to
> dma_alloc_*() family (non-coherent/caching in our case)?

Yes, exactly.  dma_alloc_noncoherent can be used exactly as alloc_pages
+ dma_map_* by the driver (including the dma_sync_* calls on reuse), but
has a huge number of advantages.

> Also, I remember I tried to do that for one my driver, but the thing
> that all those functions zero the whole page(s) before returning them to
> the driver ruins the performance -- we don't need to zero buffers for
> receiving packets and spend a ton of cycles on it (esp. in cases when 4k
> gets zeroed each time, but your main body of traffic is 64-byte frames).

Hmm, the single zeroing when doing the initial allocation shows up
in these profiles?
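
For illustration only, here is a minimal sketch of the allocate-once,
sync-on-reuse pattern for an Rx buffer. The rx_buf structure and the
rx_buf_*() helpers are hypothetical names, not taken from any driver. The
block at the top holds userspace stand-ins so the sketch builds outside the
kernel; in a driver those declarations come from <linux/dma-mapping.h>, and
the counters in struct device exist only in this stand-in.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* ---- Userspace stand-ins (assumptions); a driver gets these from
 * <linux/dma-mapping.h>. The counters below are illustration only. ---- */
struct device { int sync_for_cpu; int sync_for_device; };
typedef uint64_t dma_addr_t;
typedef unsigned int gfp_t;
#define GFP_KERNEL 0u
enum dma_data_direction { DMA_BIDIRECTIONAL, DMA_TO_DEVICE, DMA_FROM_DEVICE };

static void *dma_alloc_noncoherent(struct device *dev, size_t size,
				   dma_addr_t *dma_handle,
				   enum dma_data_direction dir, gfp_t gfp)
{
	void *vaddr = calloc(1, size);	/* zeroed once, at allocation time */

	(void)dev; (void)dir; (void)gfp;
	if (vaddr)
		*dma_handle = (dma_addr_t)(uintptr_t)vaddr;
	return vaddr;
}

static void dma_free_noncoherent(struct device *dev, size_t size, void *vaddr,
				 dma_addr_t dma_handle,
				 enum dma_data_direction dir)
{
	(void)dev; (void)size; (void)dma_handle; (void)dir;
	free(vaddr);
}

static void dma_sync_single_for_cpu(struct device *dev, dma_addr_t addr,
				    size_t size, enum dma_data_direction dir)
{
	(void)addr; (void)size; (void)dir;
	dev->sync_for_cpu++;
}

static void dma_sync_single_for_device(struct device *dev, dma_addr_t addr,
				       size_t size, enum dma_data_direction dir)
{
	(void)addr; (void)size; (void)dir;
	dev->sync_for_device++;
}

/* ---- Hypothetical driver side ---- */
struct rx_buf {
	void *vaddr;
	dma_addr_t dma;
	size_t size;
};

/* Allocate and map in one step; this replaces alloc_pages + dma_map_*. */
static int rx_buf_init(struct device *dev, struct rx_buf *buf, size_t size)
{
	buf->vaddr = dma_alloc_noncoherent(dev, size, &buf->dma,
					   DMA_FROM_DEVICE, GFP_KERNEL);
	if (!buf->vaddr)
		return -1;	/* a driver would return -ENOMEM */
	buf->size = size;
	return 0;
}

/* The device wrote len bytes: hand that range to the CPU before reading. */
static void *rx_buf_receive(struct device *dev, struct rx_buf *buf, size_t len)
{
	dma_sync_single_for_cpu(dev, buf->dma, len, DMA_FROM_DEVICE);
	return buf->vaddr;
}

/* Recycle: hand the buffer back to the device, no remap and no zeroing. */
static void rx_buf_recycle(struct device *dev, struct rx_buf *buf)
{
	dma_sync_single_for_device(dev, buf->dma, buf->size, DMA_FROM_DEVICE);
}

static void rx_buf_destroy(struct device *dev, struct rx_buf *buf)
{
	dma_free_noncoherent(dev, buf->size, buf->vaddr, buf->dma,
			     DMA_FROM_DEVICE);
	buf->vaddr = NULL;
}
```

The only per-packet work in this sketch is the pair of dma_sync_single_*
calls; allocation (and its zeroing) happens once in rx_buf_init().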