Dave Jiang <dave.jiang@xxxxxxxxx> writes:
> Adding DMA support for pmem blk reads. This provides significant CPU
> reduction with large memory reads with good performance. DMAs are
> triggered with a test against bio_multiple_segments(), so small I/Os
> (4k or less?) are still performed by the CPU in order to reduce latency.
> By default the pmem driver will be using blk-mq with DMA.
>
> Numbers below are measured against pmem simulated via DRAM using
> memmap=NN!SS. The DMA engine used is ioatdma on an Intel Skylake Xeon
> platform. Keep in mind the performance for actual persistent memory
> will differ.
> Fio 2.21 was used.
>
> 64k: 1 task queuedepth=1
> CPU Read:  7631 MB/s   99.7% CPU    DMA Read:  2415 MB/s   54% CPU
> CPU Write: 3552 MB/s  100% CPU      DMA Write: 2173 MB/s   54% CPU
>
> 64k: 16 tasks queuedepth=16
> CPU Read:  36800 MB/s 1593% CPU     DMA Read:  29100 MB/s 607% CPU
> CPU Write: 20900 MB/s 1589% CPU     DMA Write: 23400 MB/s 585% CPU
>
> 2M: 1 task queuedepth=1
> CPU Read:  6013 MB/s   99.3% CPU    DMA Read:  7986 MB/s  59.3% CPU
> CPU Write: 3579 MB/s  100% CPU      DMA Write: 5211 MB/s  58.3% CPU
>
> 2M: 16 tasks queuedepth=16
> CPU Read:  18100 MB/s 1588% CPU     DMA Read:  21300 MB/s 180.9% CPU
> CPU Write: 14100 MB/s 1594% CPU     DMA Write: 20400 MB/s 446.9% CPU
>
> Signed-off-by: Dave Jiang <dave.jiang@xxxxxxxxx>
> ---

Hi Dave,

The table above shows a performance benefit for 2M transfers but a
regression for 64k transfers, if we set aside the CPU utilization for a
second. Would it be beneficial to have a heuristic on the transfer size
that decides when to use DMA and when not to?

You introduced this hunk:

-	rc = pmem_handle_cmd(cmd);
+	if (cmd->chan && bio_multiple_segments(req->bio))
+		rc = pmem_handle_cmd_dma(cmd, op_is_write(req_op(req)));
+	else
+		rc = pmem_handle_cmd(cmd);

which uses DMA for bios with multiple segments and the old CPU path for
single-segment bios. Maybe the single/multi-segment logic could be
amended to something like:

	if (cmd->chan && bio_segments(req->bio) > PMEM_DMA_THRESH)
		rc = pmem_handle_cmd_dma(cmd, op_is_write(req_op(req)));
	else
		rc = pmem_handle_cmd(cmd);

Just something worth considering IMHO.
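If the crossover point between the CPU and DMA paths turns out to be
platform dependent (ioatdma here, other engines will differ), the
threshold could even be a module parameter rather than a compile-time
constant, so it can be tuned without recompiling. A rough sketch only;
pmem_dma_thresh and its default value are made up for illustration and
are not part of your patch:

	/*
	 * Hypothetical tunable: bio segment count above which the DMA
	 * path is preferred over the CPU memcpy path.
	 */
	static unsigned int pmem_dma_thresh = 2;
	module_param(pmem_dma_thresh, uint, 0644);
	MODULE_PARM_DESC(pmem_dma_thresh,
			"bio segment count above which pmem I/O uses DMA");

	...
		if (cmd->chan && bio_segments(req->bio) > pmem_dma_thresh)
			rc = pmem_handle_cmd_dma(cmd, op_is_write(req_op(req)));
		else
			rc = pmem_handle_cmd(cmd);

With 0644 permissions the value is also adjustable at runtime via
sysfs, which would make it easy to benchmark the 64k case with DMA
forced on or off on a given system.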
> +	len = blk_rq_payload_bytes(req);
> +	page = virt_to_page(pmem_addr);
> +	off = (u64)pmem_addr & ~PAGE_MASK;

	off = offset_in_page(pmem_addr);

?
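For reference, offset_in_page() in include/linux/mm.h expands to the
same masking:

	#define offset_in_page(p)	((unsigned long)(p) & ~PAGE_MASK)

so the change is purely cosmetic, but it avoids the open-coded cast and
documents the intent.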
--
Johannes Thumshirn                                          Storage
jthumshirn@xxxxxxx                                +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850