Re: [PATCH mm/bpf v2] mm: bpf: add find_vma_no_check() without lockdep_assert on mm->mmap_lock

Yonghong Song <yhs@xxxxxx> · Wed, 8 Sep 2021 12:11:54 -0700






On 9/8/21 11:49 AM, Jason Gunthorpe wrote:
On Wed, Sep 08, 2021 at 06:30:52PM +0000, Liam Howlett wrote:

  /* Look up the first VMA which satisfies  addr < vm_end,  NULL if none. */
-struct vm_area_struct *find_vma(struct mm_struct *mm, unsigned long addr)
+struct vm_area_struct *find_vma_non_owner(struct mm_struct *mm,
+					 unsigned long addr)
  {
  	struct rb_node *rb_node;
  	struct vm_area_struct *vma;
  
-	mmap_assert_locked(mm);
+	VM_BUG_ON_MM(!rwsem_is_locked(&mm->mmap_lock), mm);
  	/* Check the cache first. */
  	vma = vmacache_find(mm, addr);
  	if (likely(vma))
@@ -2325,6 +2326,11 @@ struct vm_area_struct *find_vma(struct mm_struct *mm, unsigned long addr)
  	return vma;
  }
  
+struct vm_area_struct *find_vma(struct mm_struct *mm, unsigned long addr)
+{
+	lockdep_assert_held(&mm->mmap_lock);
+	return find_vma_non_owner(mm, addr);
+}
  EXPORT_SYMBOL(find_vma);
  
  /*


Although this leaks more into the mm API and was referred to as ugly
previously, it does provide a working solution and still maintains the
same level of checking.

I think it is no better than before.

The solution must be to not break lockdep in the BPF side. If Peter's
reworked algorithm is not OK then BPF should drop/acquire the lockdep
when it punts the unlock to the WQ.

The current warning is triggered by bpf calling find_vma().
Is it too late even if bpf does drop/acquire the lockdep when it punts
the unlock to the WQ with irq_work_queue()? Maybe I missed something,
could you be more specific about your proposed solution?


'non-owner' is just a nice way of saying 'the caller is messing with
lockdep', it is not a sane way to design APIs

Jason