On (03/25/15 21:43), cascardo@xxxxxxxxxxxxxxxxxx wrote:
> However, when using large TCP send/recv (I used uperf with 64KB
> writes/reads), I noticed that on the transmit side, largealloc is not
> used, but on the receive side, cxgb4 almost only uses largealloc, while
> qlge seems to have a 1/1 usage or largealloc/non-largealloc mappings.
> When turning GRO off, that ratio is closer to 1/10, meaning there is
> still some fair use of largealloc in that scenario.
>
> I confess my experiments are not complete. I would like to test a couple
> of other drivers as well, including mlx4_en and bnx2x, and test with
> small packet sizes. I suspected that MTU size could make a difference,
> but in the case of ICMP, with MTU 9000 and payload of 8000 bytes, I
> didn't notice any significant hit of largepool with either qlge or
> cxgb4.

I guess we also need to consider the "average use-case", i.e.,
something that interleaves small packets and interactive data with
jumbo/bulk data. In those cases, the largepool would not get many
hits, and might actually be undesirable?

> But I believe that on the receive side, all drivers should map entire
> pages, using some allocation strategy similar to mlx4_en, in order to
> avoid DMA mapping all the time.

Good point. I think in the early phase of my perf investigation, it was
brought up that Solaris does pre-mapped DMA buffers (they have to do
this carefully, to avoid resource-starvation vulnerabilities; see
http://www.spinics.net/lists/sparclinux/msg13217.html and the threads
leading to it). This is not something that the common iommu-arena
allocator can/should get involved in, of course. The scope of the
arena-allocator is much more rigorously defined. I don't know if there
is a way to set up a generalized DMA premapped buffer infra for Linux
today.
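
To make the "map entire pages" idea quoted above a bit more concrete,
here is a rough userspace toy model, not kernel code. It only counts how
often a mapping operation would be issued under two strategies. The
names toy_dma_map, maps_per_buffer and maps_per_page are made up for
illustration, and the 4096-byte page size is an assumption; none of this
is taken from mlx4_en or any other driver.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SZ 4096u   /* assumed page size for this toy model */

/* Counts how often the simulated mapping path is taken. */
static unsigned long map_calls;

/* Hypothetical stand-in for a DMA map operation; not a kernel API. */
static void toy_dma_map(size_t len)
{
    (void)len;
    map_calls++;
}

/* Strategy A: issue one mapping per receive buffer. */
unsigned long maps_per_buffer(unsigned nbufs, size_t buf_sz)
{
    map_calls = 0;
    for (unsigned i = 0; i < nbufs; i++)
        toy_dma_map(buf_sz);
    return map_calls;
}

/* Strategy B: map a whole page once and carve buffers out of it. */
unsigned long maps_per_page(unsigned nbufs, size_t buf_sz)
{
    size_t off = PAGE_SZ;   /* "page exhausted", so the first buffer maps */

    assert(buf_sz > 0 && buf_sz <= PAGE_SZ);
    map_calls = 0;
    for (unsigned i = 0; i < nbufs; i++) {
        if (off + buf_sz > PAGE_SZ) {
            toy_dma_map(PAGE_SZ);   /* map a fresh page */
            off = 0;
        }
        off += buf_sz;              /* hand out the next slice */
    }
    return map_calls;
}
```

With 2048-byte buffers, the per-page strategy issues half as many
mappings as the per-buffer one in this model. It says nothing about the
actual cost of a mapping, or about unmap and page-recycling behaviour,
which a real driver would also have to deal with.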

FWIW, when I instrumented this for Solaris (there are hooks to disable
the pre-mapped buffers), the impact on a T5-2 (8 sockets, 2 NUMA nodes,
64 CPUs) was not very significant for a single 10G ixgbe port: approx
8 Gbps instead of 9.X Gbps. I think the DMA buffer pre-mapping is only
significant when you start trying to scale to multiple Ethernet ports.

--Sowmini