On 11/11/2022 17:22, Daisuke Matsuda wrote:
> ib_umem_odp_map_dma_single_page(), which has been used only by the mlx5
> driver, holds umem_mutex on success and releases on failure. This
> behavior is not convenient for other drivers to use it, so change it to
> always retain mutex on return.
> 
> Signed-off-by: Daisuke Matsuda <matsuda-daisuke@xxxxxxxxxxx>
> ---
>  drivers/infiniband/core/umem_odp.c | 8 +++-----
>  drivers/infiniband/hw/mlx5/odp.c   | 4 +++-
>  2 files changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/infiniband/core/umem_odp.c b/drivers/infiniband/core/umem_odp.c
> index e9fa22d31c23..49da6735f7c8 100644
> --- a/drivers/infiniband/core/umem_odp.c
> +++ b/drivers/infiniband/core/umem_odp.c
> @@ -328,8 +328,8 @@ static int ib_umem_odp_map_dma_single_page(
>   *
>   * Maps the range passed in the argument to DMA addresses.
>   * The DMA addresses of the mapped pages is updated in umem_odp->dma_list.
> - * Upon success the ODP MR will be locked to let caller complete its device
> - * page table update.
> + * The umem mutex is locked in this function. Callers are responsible for
> + * releasing the lock.
>   *
>   * Returns the number of pages mapped in success, negative error code
>   * for failure.
> @@ -453,11 +453,9 @@ int ib_umem_odp_map_dma_and_lock(struct ib_umem_odp *umem_odp, u64 user_virt,
>  			break;
>  		}
>  	}
> -	/* upon success lock should stay on hold for the callee */
> +
>  	if (!ret)
>  		ret = dma_index - start_idx;
> -	else
> -		mutex_unlock(&umem_odp->umem_mutex);
>  
>  out_put_mm:
>  	mmput_async(owning_mm);
> diff --git a/drivers/infiniband/hw/mlx5/odp.c b/drivers/infiniband/hw/mlx5/odp.c
> index bc97958818bb..a0de27651586 100644
> --- a/drivers/infiniband/hw/mlx5/odp.c
> +++ b/drivers/infiniband/hw/mlx5/odp.c
> @@ -572,8 +572,10 @@ static int pagefault_real_mr(struct mlx5_ib_mr *mr, struct ib_umem_odp *odp,
>  		access_mask |= ODP_WRITE_ALLOWED_BIT;
>  
>  	np = ib_umem_odp_map_dma_and_lock(odp, user_va, bcnt, access_mask, fault);
> -	if (np < 0)
> +	if (np < 0) {
> +		mutex_unlock(&odp->umem_mutex);
>  		return np;
> +	}

Refer to the comments of ib_umem_odp_map_dma_and_lock():

334  * Returns the number of pages mapped in success, negative error code
335  * for failure.

I don't think it's correct to release the lock in all failure cases, for
example when the function takes one of the error paths below:

346 int ib_umem_odp_map_dma_and_lock(struct ib_umem_odp *umem_odp, u64 user_virt,
347 				 u64 bcnt, u64 access_mask, bool fault)
348 	__acquires(&umem_odp->umem_mutex)
349 {
350 	struct task_struct *owning_process = NULL;
351 	struct mm_struct *owning_mm = umem_odp->umem.owning_mm;
352 	int pfn_index, dma_index, ret = 0, start_idx;
353 	unsigned int page_shift, hmm_order, pfn_start_idx;
354 	unsigned long num_pfns, current_seq;
355 	struct hmm_range range = {};
356 	unsigned long timeout;
357 
358 	if (access_mask == 0)
359 		return -EINVAL;		<<<<< no lock is held yet
360 
361 	if (user_virt < ib_umem_start(umem_odp) ||
362 	    user_virt + bcnt > ib_umem_end(umem_odp))
363 		return -EFAULT;		<<<<< no lock is held yet

Furthermore, this changes the behavior of a public API. Does that matter
for out-of-tree drivers that use it? I'm not familiar with this area;
maybe the kernel places no restriction on it?

> 
>  	/*
>  	 * No need to check whether the MTTs really belong to this MR, since
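
To make that concrete: if one of the -EINVAL/-EFAULT paths above is
taken, the new unconditional mutex_unlock() in pagefault_real_mr() would
unlock a mutex that was never locked. A minimal sketch (untested, my own
illustration, not part of the patch) of one mechanical way to keep the
new "mutex held on return" contract on those early exits:

	if (access_mask == 0) {
		/* take the mutex so the caller's unlock stays balanced */
		mutex_lock(&umem_odp->umem_mutex);
		return -EINVAL;
	}

	if (user_virt < ib_umem_start(umem_odp) ||
	    user_virt + bcnt > ib_umem_end(umem_odp)) {
		mutex_lock(&umem_odp->umem_mutex);
		return -EFAULT;
	}

Whether locking just to satisfy the contract is acceptable, or whether
callers should instead unlock only on the paths that actually acquired
the lock, is up to you.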