On 2020/12/9 22:42, Eric Farman wrote:
>
>
> On 12/9/20 6:54 AM, Cornelia Huck wrote:
>> On Tue, 8 Dec 2020 21:55:53 +0800
>> "xuxiaoyang (C)" <xuxiaoyang2@xxxxxxxxxx> wrote:
>>
>>> On 2020/11/21 15:58, xuxiaoyang (C) wrote:
>>>> vfio_pin_pages() accepts an array of unrelated iova pfns and processes
>>>> each to return the physical pfn. When dealing with large arrays of
>>>> contiguous iovas, vfio_iommu_type1_pin_pages is very inefficient because
>>>> it is processed page by page. In this case, we can divide the iova pfn
>>>> array into multiple continuous ranges and optimize them. For example,
>>>> when the iova pfn array is {1,5,6,7,9}, it will be divided into three
>>>> groups {1}, {5,6,7}, {9} for processing. When processing {5,6,7}, the
>>>> number of calls to pin_user_pages_remote is reduced from 3 times to once.
>>>> For single page or large array of discontinuous iovas, we still use
>>>> vfio_pin_page_external to deal with it to reduce the performance loss
>>>> caused by refactoring.
>>>>
>>>> Signed-off-by: Xiaoyang Xu <xuxiaoyang2@xxxxxxxxxx>
>>
>> (...)
>>
>>>
>>> hi Cornelia Huck, Eric Farman, Zhenyu Wang, Zhi Wang
>>>
>>> vfio_pin_pages() accepts an array of unrelated iova pfns and processes
>>> each to return the physical pfn. When dealing with large arrays of
>>> contiguous iovas, vfio_iommu_type1_pin_pages is very inefficient because
>>> it is processed page by page. In this case, we can divide the iova pfn
>>> array into multiple continuous ranges and optimize them. I have a set
>>> of performance test data for reference.
>>>
>>> The patch was not applied
>>>                    1 page        512 pages
>>> no huge pages:     1638ns        223651ns
>>> THP:               1668ns        222330ns
>>> HugeTLB:           1526ns        208151ns
>>>
>>> The patch was applied
>>>                    1 page        512 pages
>>> no huge pages:     1735ns        167286ns
>>> THP:               1934ns        126900ns
>>> HugeTLB:           1713ns        102188ns
>>>
>>> As Alex Williamson said, this patch lacks proof that it works in the
>>> real world. I think you will have some valuable opinions.
>>
>> Looking at this from the vfio-ccw angle, I'm not sure how much this
>> would buy us, as we deal with IDAWs, which are designed so that they
>> can be non-contiguous. I guess this depends a lot on what the guest
>> does.
>
> This would be my concern too, but I don't have data off the top of my head to say one way or another...
>
>>
>> Eric, any opinion? Do you maybe also happen to have a test setup that
>> mimics workloads actually seen in the real world?
>>
>
> ...I do have some test setups, which I will try to get some data from in a couple days. At the moment I've broken most of those setups trying to implement some other stuff, and can't revert back at the moment. Will get back to this.
>
> Eric
> .
Thank you for your reply. Looking forward to your test data.

Regards,
Xu
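
For reference, the grouping described in the quoted commit message
({1,5,6,7,9} split into {1}, {5,6,7}, {9}) could be sketched in plain
userspace C roughly as follows. This is only an illustration under
assumptions: struct pfn_range and split_contiguous() are hypothetical
names that do not come from the patch, and the sketch performs no
pinning. It only shows how contiguous runs might be detected so that
each run could be handed to a single pin call.

```c
#include <stddef.h>

/* Hypothetical type (not from the patch): one run of consecutive pfns. */
struct pfn_range {
	unsigned long start;	/* first iova pfn of the run */
	size_t npages;		/* number of consecutive pfns in the run */
};

/*
 * Hypothetical helper (not from the patch): split pfns[0..n) into maximal
 * runs in which every entry is exactly one greater than its predecessor.
 * At most max_out ranges are written to out; the return value is the
 * number of ranges written.
 */
static size_t split_contiguous(const unsigned long *pfns, size_t n,
			       struct pfn_range *out, size_t max_out)
{
	size_t nranges = 0;
	size_t i = 0;

	while (i < n && nranges < max_out) {
		size_t len = 1;

		/* Extend the run while the next pfn follows directly. */
		while (i + len < n && pfns[i + len] == pfns[i] + len)
			len++;

		out[nranges].start = pfns[i];
		out[nranges].npages = len;
		nranges++;
		i += len;
	}

	return nranges;
}
```

With the array from the commit message this yields three ranges, so the
run {5,6,7} would need one pin call instead of three. A fully
non-contiguous array yields one range per page, which matches the
concern raised above for IDAW-style workloads.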