On Mon, Mar 20, 2017 at 09:19:36PM -0500, Shiraz Saleem wrote: > On Sun, Mar 19, 2017 at 10:55:56AM +0200, Leon Romanovsky wrote: > > From: Leon Romanovsky <leonro@xxxxxxxxxxxx> > > > > IB umem structure has field hugetlb which was assigned, but except > > i40iw driver, it was actually never used. > > > > This patch drops that variable out of the struct ib_umem, removes > > all writes to it, get rids of VMA allocation which wasn't used and > > removes check hugetlb from i40iw driver, because it is checked anyway in > > i40iw_set_hugetlb_values() routine. > > > > @@ -143,9 +141,6 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr, > > > > umem->odp_data = NULL; > > > > - /* We assume the memory is from hugetlb until proved otherwise */ > > - umem->hugetlb = 1; > > - > > page_list = (struct page **) __get_free_page(GFP_KERNEL); > > if (!page_list) { > > put_pid(umem->pid); > > @@ -153,14 +148,6 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr, > > return ERR_PTR(-ENOMEM); > > } > > > > - /* > > - * if we can't alloc the vma_list, it's not so bad; > > - * just assume the memory is not hugetlb memory > > - */ > > - vma_list = (struct vm_area_struct **) __get_free_page(GFP_KERNEL); > > - if (!vma_list) > > - umem->hugetlb = 0; > > - > > npages = ib_umem_num_pages(umem); > > > > down_write(¤t->mm->mmap_sem); > > @@ -194,7 +181,7 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr, > > ret = get_user_pages(cur_base, > > min_t(unsigned long, npages, > > PAGE_SIZE / sizeof (struct page *)), > > - gup_flags, page_list, vma_list); > > + gup_flags, page_list, NULL); > > > > if (ret < 0) > > goto out; > > @@ -203,12 +190,8 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr, > > cur_base += ret * PAGE_SIZE; > > npages -= ret; > > > > - for_each_sg(sg_list_start, sg, ret, i) { > > - if (vma_list && !is_vm_hugetlb_page(vma_list[i])) > > - umem->hugetlb = 0; > > - > > + for_each_sg(sg_list_start, sg, ret, i) > > sg_set_page(sg, page_list[i], PAGE_SIZE, 0); > > - } > > > > /* preparing for next loop */ > > sg_list_start = sg; > > > diff --git a/drivers/infiniband/hw/i40iw/i40iw_verbs.c b/drivers/infiniband/hw/i40iw/i40iw_verbs.c > > index 9b2849979756..59c8d74e3704 100644 > > --- a/drivers/infiniband/hw/i40iw/i40iw_verbs.c > > +++ b/drivers/infiniband/hw/i40iw/i40iw_verbs.c > > @@ -1850,7 +1850,7 @@ static struct ib_mr *i40iw_reg_user_mr(struct ib_pd *pd, > > iwmr->page_size = region->page_size; > > iwmr->page_msk = PAGE_MASK; > > > > - if (region->hugetlb && (req.reg_type == IW_MEMREG_TYPE_MEM)) > > + if (req.reg_type == IW_MEMREG_TYPE_MEM) > > i40iw_set_hugetlb_values(start, iwmr); > > The code in ib_umem_get iterates through __all__ VMA’s corresponding to the > user-space buffer and hugetlb field of the umem struct is set only if they > are all using huge_pages. So in the essence, umem->hugetlb provides the > driver info on whether __all__ pages of the memory region uses huge pages > or not. > > i40iw_set_hugetlb_values only finds the VMA w.r.t the start virtual address of > the user-space buffer and checks if __this__ VMA uses hugetlb pages or not. > It is a sanity check done on the VMA found. > > So the question is -- Do we know that only a single VMA represents the user-space > buffer backed by huge pages? > If not, it doesnt appear that the check in i40iw_set_hugetlb_values will suffice to > identify if the entire memory region is composed of huge pages since it doesnt > check all the VMAs. umem->hugetlb would be required for it. You are right and thank you for pointing that. User memory buffer indeed can span more than one VMA. For example one reg_mr with READ property for two adjacent VAMs, one with READ properties and another with READ-WRITE. However, relying on hugetlb which was set during creation is not best thing too. It can be changed during the application run. The mlx5 relies on the mlx5_ib_cont_pages() function to understand the actual layout. Thanks > > -- > To unsubscribe from this list: send the line "unsubscribe linux-rdma" in > the body of a message to majordomo@xxxxxxxxxxxxxxx > More majordomo info at http://vger.kernel.org/majordomo-info.html
Attachment:
signature.asc
Description: PGP signature