On Fri, Jan 05, 2024 at 01:55:11PM +0800, Junxian Huang wrote: > > > On 2024/1/5 4:29, Jason Gunthorpe wrote: > > On Tue, Dec 26, 2023 at 05:16:33PM +0800, Junxian Huang wrote: > >> > >> > >> On 2023/12/26 16:52, Leon Romanovsky wrote: > >>> On Mon, Dec 25, 2023 at 03:53:28PM +0800, Junxian Huang wrote: > >>>> From: Chengchang Tang <tangchengchang@xxxxxxxxxx> > >>>> > >>>> In the current implementation, a fixed page size is used to > >>>> configure the PBL, which is not flexible enough and is not > >>>> conducive to the performance of the HW. > >>>> > >>>> Signed-off-by: Chengchang Tang <tangchengchang@xxxxxxxxxx> > >>>> Signed-off-by: Junxian Huang <huangjunxian6@xxxxxxxxxxxxx> > >>>> --- > >>>> drivers/infiniband/hw/hns/hns_roce_alloc.c | 6 - > >>>> drivers/infiniband/hw/hns/hns_roce_device.h | 9 ++ > >>>> drivers/infiniband/hw/hns/hns_roce_mr.c | 168 +++++++++++++++----- > >>>> 3 files changed, 139 insertions(+), 44 deletions(-) > >>> > >>> I'm wonder if the ib_umem_find_best_pgsz() API should be used instead. > >>> What is missing there? > >>> > >>> Thanks > >> > >> Actually this API is used for umem. > >> For kmem, we add hns_roce_find_buf_best_pgsz() to do a similar job. > > > > But why do you need to do something like this for kmem? It looked to > > me like kmem knows its allocation size when it was allocated, how come > > you need to iterate over all of it again? > > > > Jason > > > > kmem was split into multiple small pages for allocation to prevent allocation > failure due to memory fragmentation. > > And now we add this function to confirm whether these small pages have contiguous > address. If so, they can be combined into one huge page for use, which is more > likely when iommu/smmu is enabled. That seems unncessary. The chances you get contiguous pages from a lot of random allocations is really slim. If you care about this optimization then you should have the allocator explicitly request contiguous pages with high order allocations in a manner that quickly fails and falls back to PAGE_SIZE. Then you just use the size that the allocator was able to get, not try to figure it out after the fact. Jason