---
Bernard Metzler, PhD
Tech. Leader High Performance I/O, Principal Research Staff
IBM Zurich Research Laboratory
Saeumerstrasse 4
CH-8803 Rueschlikon, Switzerland
+41 44 724 8605

-----"Jason Gunthorpe" <jgg@xxxxxxxx> wrote: -----

>To: "Bernard Metzler" <bmt@xxxxxxxxxxxxxx>
>From: "Jason Gunthorpe" <jgg@xxxxxxxx>
>Date: 03/08/2019 02:47PM
>Cc: linux-rdma@xxxxxxxxxxxxxxx
>Subject: Re: [PATCH v5 07/13] SIW application buffer management
>
>On Tue, Feb 19, 2019 at 11:08:57AM +0100, Bernard Metzler wrote:
>> +struct siw_umem *siw_umem_get(u64 start, u64 len)
>> +{
>> +	struct siw_umem *umem;
>> +	u64 first_page_va;
>> +	unsigned long mlock_limit;
>> +	int num_pages, num_chunks, i, rv = 0;
>> +
>> +	if (!can_do_mlock())
>> +		return ERR_PTR(-EPERM);
>> +
>> +	if (!len)
>> +		return ERR_PTR(-EINVAL);
>> +
>> +	first_page_va = start & PAGE_MASK;
>> +	num_pages = PAGE_ALIGN(start + len - first_page_va) >> PAGE_SHIFT;
>> +	num_chunks = (num_pages >> CHUNK_SHIFT) + 1;
>> +
>> +	umem = kzalloc(sizeof(*umem), GFP_KERNEL);
>> +	if (!umem)
>> +		return ERR_PTR(-ENOMEM);
>> +
>> +	umem->pid = get_task_pid(current, PIDTYPE_PID);
>> +
>> +	down_write(&current->mm->mmap_sem);
>> +
>> +	mlock_limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;
>> +
>> +	if (num_pages + atomic64_read(&current->mm->pinned_vm) > mlock_limit) {
>> +		rv = -ENOMEM;
>> +		goto out;
>> +	}
>> +	umem->fp_addr = first_page_va;
>> +
>> +	umem->page_chunk = kcalloc(num_chunks, sizeof(struct siw_page_chunk),
>> +				   GFP_KERNEL);
>> +	if (!umem->page_chunk) {
>> +		rv = -ENOMEM;
>> +		goto out;
>> +	}
>> +	for (i = 0; num_pages; i++) {
>> +		int got, nents = min_t(int, num_pages, PAGES_PER_CHUNK);
>> +
>> +		umem->page_chunk[i].p = kcalloc(nents, sizeof(struct page *),
>> +						GFP_KERNEL);
>> +		if (!umem->page_chunk[i].p) {
>> +			rv = -ENOMEM;
>> +			goto out;
>> +		}
>> +		got = 0;
>> +		while (nents) {
>> +			struct page **plist = &umem->page_chunk[i].p[got];
>> +
>> +			rv = get_user_pages(first_page_va, nents, FOLL_WRITE,
>> +					    plist, NULL);
>> +			if (rv < 0)
>> +				goto out;
>> +
>> +			umem->num_pages += rv;
>> +			atomic64_add(rv, &current->mm->pinned_vm);
>> +			first_page_va += rv * PAGE_SIZE;
>> +			nents -= rv;
>> +			got += rv;
>> +		}
>> +		num_pages -= got;
>> +	}
>
>Actually why isn't this just using umem_get?

I found it not really optimized for a fast lookup of a page from a vaddr.
I had all that umem code in before, and it was a mess if one wants to start
sending/receiving from/to an address, or to resume doing so, without
searching through the sg lists again. I am not sure whether it is as
efficient in rxe (I didn't check). siw_get_upage() is rather simple in the
end and does what I need.

>rxe managed it. IIRC you set the dma_ops properly and the dma_map does
>nothing, so this boils down to the same as umem get.
>
>Jason
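
As a side note for readers following the thread: the O(1) vaddr-to-page
translation argued for above can be sketched in plain userspace C. This is
only an illustration of the chunked-array indexing, not the actual driver
code; the CHUNK_SHIFT value, the opaque struct page stand-in, and the bounds
checks are assumptions modeled on the quoted patch.

```c
/* Userspace sketch of a siw_get_upage()-style lookup over the chunked
 * page-pointer arrays built by siw_umem_get(). Values are illustrative. */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define PAGE_SHIFT	12
#define PAGE_SIZE	(1UL << PAGE_SHIFT)
#define CHUNK_SHIFT	9			/* assumed: 512 pages per chunk */
#define PAGES_PER_CHUNK	(1 << CHUNK_SHIFT)

struct page;					/* opaque stand-in */

struct siw_page_chunk {
	struct page **p;
};

struct siw_umem {
	unsigned long fp_addr;			/* VA of first pinned page */
	int num_pages;
	struct siw_page_chunk *page_chunk;
};

/* Constant-time translation: page index within the pinned range selects
 * the chunk (upper bits) and the slot within that chunk (lower bits),
 * with no walk over scatter/gather lists. */
static struct page *siw_get_upage(struct siw_umem *umem, unsigned long addr)
{
	unsigned int page_idx = (addr - umem->fp_addr) >> PAGE_SHIFT;
	unsigned int chunk_idx = page_idx >> CHUNK_SHIFT;
	unsigned int slot = page_idx & (PAGES_PER_CHUNK - 1);

	if (addr < umem->fp_addr || page_idx >= (unsigned int)umem->num_pages)
		return NULL;
	return umem->page_chunk[chunk_idx].p[slot];
}
```

The point of the layout is that resuming a transfer at an arbitrary offset
costs two array indexing operations, independent of how large the pinned
region is.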