On 2024/1/5 22:18, Jason Gunthorpe wrote: > On Fri, Jan 05, 2024 at 01:55:11PM +0800, Junxian Huang wrote: >> >> >> On 2024/1/5 4:29, Jason Gunthorpe wrote: >>> On Tue, Dec 26, 2023 at 05:16:33PM +0800, Junxian Huang wrote: >>>> >>>> >>>> On 2023/12/26 16:52, Leon Romanovsky wrote: >>>>> On Mon, Dec 25, 2023 at 03:53:28PM +0800, Junxian Huang wrote: >>>>>> From: Chengchang Tang <tangchengchang@xxxxxxxxxx> >>>>>> >>>>>> In the current implementation, a fixed page size is used to >>>>>> configure the PBL, which is not flexible enough and is not >>>>>> conducive to the performance of the HW. >>>>>> >>>>>> Signed-off-by: Chengchang Tang <tangchengchang@xxxxxxxxxx> >>>>>> Signed-off-by: Junxian Huang <huangjunxian6@xxxxxxxxxxxxx> >>>>>> --- >>>>>> drivers/infiniband/hw/hns/hns_roce_alloc.c | 6 - >>>>>> drivers/infiniband/hw/hns/hns_roce_device.h | 9 ++ >>>>>> drivers/infiniband/hw/hns/hns_roce_mr.c | 168 +++++++++++++++----- >>>>>> 3 files changed, 139 insertions(+), 44 deletions(-) >>>>> >>>>> I'm wonder if the ib_umem_find_best_pgsz() API should be used instead. >>>>> What is missing there? >>>>> >>>>> Thanks >>>> >>>> Actually this API is used for umem. >>>> For kmem, we add hns_roce_find_buf_best_pgsz() to do a similar job. >>> >>> But why do you need to do something like this for kmem? It looked to >>> me like kmem knows its allocation size when it was allocated, how come >>> you need to iterate over all of it again? >>> >>> Jason >>> >> >> kmem was split into multiple small pages for allocation to prevent allocation >> failure due to memory fragmentation. >> >> And now we add this function to confirm whether these small pages have contiguous >> address. If so, they can be combined into one huge page for use, which is more >> likely when iommu/smmu is enabled. > > That seems unncessary. The chances you get contiguous pages from > a lot of random allocations is really slim. > > If you care about this optimization then you should have the allocator > explicitly request contiguous pages with high order allocations in a > manner that quickly fails and falls back to PAGE_SIZE. > > Then you just use the size that the allocator was able to get, not try > to figure it out after the fact. > > Jason Thanks for the suggestion! We'll consider a new implementation. For this patch, I'll drop the kmem modification and re-send a v2. Junxian