On Mon, Jun 12, 2023 at 11:34 AM Peter Xu <peterx@xxxxxxxxxx> wrote:
>
> On Mon, Jun 12, 2023 at 09:07:38AM -0700, Suren Baghdasaryan wrote:
> > On Mon, Jun 12, 2023 at 6:36 AM Peter Xu <peterx@xxxxxxxxxx> wrote:
> > >
> > > On Fri, Jun 09, 2023 at 06:29:43PM -0700, Suren Baghdasaryan wrote:
> > > > On Fri, Jun 9, 2023 at 3:30 PM Suren Baghdasaryan <surenb@xxxxxxxxxx> wrote:
> > > > >
> > > > > On Fri, Jun 9, 2023 at 1:42 PM Peter Xu <peterx@xxxxxxxxxx> wrote:
> > > > > >
> > > > > > On Thu, Jun 08, 2023 at 05:51:56PM -0700, Suren Baghdasaryan wrote:
> > > > > > > migration_entry_wait does not need VMA lock, therefore it can be dropped
> > > > > > > before waiting. Introduce VM_FAULT_VMA_UNLOCKED to indicate that VMA
> > > > > > > lock was dropped while in handle_mm_fault().
> > > > > > > Note that once VMA lock is dropped, the VMA reference can't be used as
> > > > > > > there are no guarantees it was not freed.
> > > > > >
> > > > > > Then vma lock behaves differently from mmap read lock, am I right? Can we
> > > > > > still make them match on behaviors, or there's reason not to do so?
> > > > >
> > > > > I think we could match their behavior by also dropping mmap_lock here
> > > > > when fault is handled under mmap_lock (!(fault->flags &
> > > > > FAULT_FLAG_VMA_LOCK)).
> > > > > I missed the fact that VM_FAULT_COMPLETED can be used to skip dropping
> > > > > mmap_lock in do_page_fault(), so indeed, I might be able to use
> > > > > VM_FAULT_COMPLETED to skip vma_end_read(vma) for per-vma locks as well
> > > > > instead of introducing FAULT_FLAG_VMA_LOCK. I think that was your idea
> > > > > of reusing existing flags?
> > > > Sorry, I meant VM_FAULT_VMA_UNLOCKED, not FAULT_FLAG_VMA_LOCK in the
> > > > above reply.
> > > >
> > > > I took a closer look into using VM_FAULT_COMPLETED instead of
> > > > VM_FAULT_VMA_UNLOCKED but when we fall back from per-vma lock to
> > > > mmap_lock we need to retry with an indication that the per-vma lock
> > > > was dropped. Returning (VM_FAULT_RETRY | VM_FAULT_COMPLETE) to
> > > > indicate such state seems strange to me ("retry" and "complete" seem
> > >
> > > Not relevant to this migration patch, but for the whole idea I was thinking
> > > whether it should just work if we simply:
> > >
> > >         fault = handle_mm_fault(vma, address, flags | FAULT_FLAG_VMA_LOCK, regs);
> > > -       vma_end_read(vma);
> > > +       if (!(fault & (VM_FAULT_RETRY | VM_FAULT_COMPLETED)))
> > > +               vma_end_read(vma);
> > >
> > > ?
> >
> > Today when we can't handle a page fault under per-vma locks we return
> > VM_FAULT_RETRY, in which case per-vma lock is dropped and the fault is
>
> Oh I see what I missed. I think it may not be a good idea to reuse
> VM_FAULT_RETRY just for that, because it makes VM_FAULT_RETRY even more
> complicated - now it adds one more case where the lock is not released,
> that's when PER_VMA even if !NOWAIT.
>
> > retried under mmap_lock. The condition you suggest above would not
> > drop per-vma lock for VM_FAULT_RETRY case and would break the current
> > fallback mechanism.
> > However your suggestion gave me an idea. I could indicate that per-vma
> > lock got dropped using vmf structure (like Matthew suggested before)
> > and once handle_pte_fault(vmf) returns I could check if it returned
> > VM_FAULT_RETRY but per-vma lock is still held.
> > If that happens I can
> > call vma_end_read() before returning from __handle_mm_fault(). That
> > way any time handle_mm_fault() returns VM_FAULT_RETRY per-vma lock
> > will be already released, so your condition in do_page_fault() will
> > work correctly. That would eliminate the need for a new
> > VM_FAULT_VMA_UNLOCKED flag. WDYT?
>
> Sounds good.
>
> So probably that's the major pain for now with the legacy fallback (I'll
> have commented if I noticed it with the initial vma lock support..). I
> assume that'll go away as long as we recover the VM_FAULT_RETRY semantics
> to before.

I think so. With that change getting VM_FAULT_RETRY in do_page_fault() will
guarantee that per-vma lock was dropped. Is that what you mean?

>
> --
> Peter Xu
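
For readers following the locking contract agreed on above, here is a minimal user-space sketch of it. It is not kernel code: every `*_model` name, the `pte_outcome` enum, the `vma_locked` and `unlock_calls` fields, and the flag values are hypothetical stand-ins chosen for illustration, and the thread does not say what the field in the vmf structure would actually be called. The sketch only models the idea that the inner handler records whether the per-vma lock was dropped, that the `__handle_mm_fault()` level releases it if `VM_FAULT_RETRY` comes back with the lock still held, and that the caller then skips `vma_end_read()` for `VM_FAULT_RETRY | VM_FAULT_COMPLETED`.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flag values for this model only, not the kernel's. */
#define VM_FAULT_RETRY     0x1u
#define VM_FAULT_COMPLETED 0x2u

/* Hypothetical stand-in for the lock state tracked via the vmf structure. */
struct vmf_model {
    bool vma_locked;   /* is the per-vma read lock currently held? */
    int unlock_calls;  /* how many times the lock was released */
};

/* Models vma_end_read(); the assert catches a double release. */
static void vma_end_read_model(struct vmf_model *vmf)
{
    assert(vmf->vma_locked);
    vmf->vma_locked = false;
    vmf->unlock_calls++;
}

/* Hypothetical outcomes of the inner handler, for the simulation. */
enum pte_outcome {
    PTE_DONE,               /* fault handled, lock still held */
    PTE_RETRY_LOCK_HELD,    /* cannot handle under per-vma lock */
    PTE_RETRY_LOCK_DROPPED, /* lock dropped before waiting, then retry */
};

/* Models handle_pte_fault(): may drop the lock itself before a wait. */
static unsigned int handle_pte_fault_model(struct vmf_model *vmf,
                                           enum pte_outcome outcome)
{
    switch (outcome) {
    case PTE_RETRY_LOCK_DROPPED:
        vma_end_read_model(vmf);
        return VM_FAULT_RETRY;
    case PTE_RETRY_LOCK_HELD:
        return VM_FAULT_RETRY;
    default:
        return 0;
    }
}

/* Models the __handle_mm_fault() level: RETRY never leaves with the lock held. */
static unsigned int handle_mm_fault_model(struct vmf_model *vmf,
                                          enum pte_outcome outcome)
{
    unsigned int ret = handle_pte_fault_model(vmf, outcome);

    if ((ret & VM_FAULT_RETRY) && vmf->vma_locked)
        vma_end_read_model(vmf);
    return ret;
}

struct fault_result {
    unsigned int fault;
    bool still_locked;
    int unlock_calls;
};

/* Models the caller, using the condition proposed in the quoted diff. */
static struct fault_result do_page_fault_model(enum pte_outcome outcome)
{
    struct vmf_model vmf = { .vma_locked = true, .unlock_calls = 0 };
    unsigned int fault = handle_mm_fault_model(&vmf, outcome);

    if (!(fault & (VM_FAULT_RETRY | VM_FAULT_COMPLETED)))
        vma_end_read_model(&vmf);

    struct fault_result res = { fault, vmf.vma_locked, vmf.unlock_calls };
    return res;
}
```

In this model, calling `do_page_fault_model()` with any of the three outcomes leaves the lock released exactly once, which is the guarantee the last reply asks about. No outcome here returns `VM_FAULT_COMPLETED`; the flag appears only because the quoted condition tests for it.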