On Thu, Feb 14, 2019 at 12:23 AM Jason Gunthorpe <jgg@xxxxxxxx> wrote: > > On Wed, Feb 13, 2019 at 11:58:24PM +0530, Devesh Sharma wrote: > > On Wed, Feb 13, 2019 at 10:58 PM Jason Gunthorpe <jgg@xxxxxxxx> wrote: > > > > > > On Wed, Feb 13, 2019 at 10:44:47PM +0530, Devesh Sharma wrote: > > > > > @@ -116,12 +116,9 @@ static int __alloc_pbl(struct pci_dev *pdev, struct bnxt_qplib_pbl *pbl, > > > > > } else { > > > > > i = 0; > > > > > is_umem = true; > > > > > - for_each_sg(sghead, sg, pages, i) { > > > > > - pbl->pg_map_arr[i] = sg_dma_address(sg); > > > > > - pbl->pg_arr[i] = sg_virt(sg); > > > > > - if (!pbl->pg_arr[i]) > > > > > - goto fail; > > > > > - > > > > > + for_each_sg_dma_page(sghead, &sg_iter, pages, 0) { > > > > > + pbl->pg_map_arr[i] = sg_page_iter_dma_address(&sg_iter); > > > > > + i++; > > > > > pbl->pg_count++; > > > > The thing is we need the virtual address for future reference in this > > > > path. If we do not have virt-addr populated it breaks! > > > > > > Drivers should not use both DMA and struct page for umems. One other > > > other. > > > > > > Where is the pg_arr read in the umem case, and what is it for? > > Right there at the caller of __alloc_pbl(). > > What I see is a confusing mess where sometimes the pg_arr points to > dma_alloc_coherent value that needs to be freed, and sometimes points > to a struct page pointing into user memory. rc = __alloc_pbl(pdev, &hwq->pbl[PBL_LVL_2], sghead, pages, pg_size); if (rc) goto fail; /* Fill in lvl1 PBL */ dst_virt_ptr = (dma_addr_t **)hwq->pbl[PBL_LVL_1].pg_arr; src_phys_ptr = hwq->pbl[PBL_LVL_2].pg_map_arr; for (i = 0; i < hwq->pbl[PBL_LVL_2].pg_count; i++) { dst_virt_ptr[PTR_PG(i)][PTR_IDX(i)] = src_phys_ptr[i] | PTU_PTE_VALID; } Here the driver code is trying to fill the PTEs to the page tables data-structure needed by Broadcom devices. I agree that that code deserves careful cleanup. We can supply patch series for that. There are couple of other places too where the driver tries to access pg_arr even though the object belong to user space. One such example is __clean_cq() during destroy QP. > > All the cases that touch pg_arr seem to want to touch the > dma_alloc_coherent memory, not the user pages, so setting it to NULL > for the usage pages seems like the right thing to do? I agree that in user-space object case setting pg_arr to NULL should be okay. However the driver is broken in current condition. It immediately requires additional patch to restrict unconditional access of pg_arr data structure when the object belong to user-space. The fix would be to put check if any object belongs to user space then skip calling the functions which require pg_arr. i.e. __clean_cq() etc. > > Jason