On Tue, Jul 25, 2023, 7:31 AM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
On Tue, Jul 25, 2023 at 07:15:08AM -0700, Suren Baghdasaryan wrote:
> On Tue, Jul 25, 2023 at 5:58 AM Conor Dooley <conor.dooley@xxxxxxxxxxxxx> wrote:
> >
> > Hey,
> >
> > On Mon, Jul 24, 2023 at 07:54:02PM +0100, Matthew Wilcox (Oracle) wrote:
> > > Remove the TCP layering violation by allowing per-VMA locks on all VMAs.
> > > The fault path will immediately fail in handle_mm_fault(). There may be
> > > a small performance reduction from this patch as a little unnecessary work
> > > will be done on each page fault. See later patches for the improvement.
> > >
> > > Signed-off-by: Matthew Wilcox (Oracle) <willy@xxxxxxxxxxxxx>
> > > Reviewed-by: Suren Baghdasaryan <surenb@xxxxxxxxxx>
> > > Cc: Arjun Roy <arjunroy@xxxxxxxxxx>
> > > Cc: Eric Dumazet <edumazet@xxxxxxxxxx>
> >
> > Unless my bisection has gone awry, this is causing boot failures for me
> > in today's linux-next w/ a splat like so.
>
> This patch requires [1] to work correctly. It follows the rule
> introduced in [1] that anyone returning VM_FAULT_RETRY should also do
> vma_end_read(). [1] is merged into mm-unstable but has not reached
> linux-next yet, it seems.

No, it's in linux-next, but you didn't fix riscv ...

Andrew, can you add this fix to Suren's patch?
"mm: drop per-VMA lock when returning VM_FAULT_RETRY or VM_FAULT_COMPLETED"

Oops. Not sure how I missed riscv. Yes, please, the fix below is correct.

diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c
index 046732fcb48c..6115d7514972 100644
--- a/arch/riscv/mm/fault.c
+++ b/arch/riscv/mm/fault.c
@@ -296,7 +296,8 @@ void handle_page_fault(struct pt_regs *regs)
 	}
 
 	fault = handle_mm_fault(vma, addr, flags | FAULT_FLAG_VMA_LOCK, regs);
-	vma_end_read(vma);
+	if (!(fault & (VM_FAULT_RETRY | VM_FAULT_COMPLETED)))
+		vma_end_read(vma);
 
 	if (!(fault & VM_FAULT_RETRY)) {
 		count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
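
For anyone following along, here is a rough userspace sketch of the locking rule discussed above: whoever returns VM_FAULT_RETRY or VM_FAULT_COMPLETED has already dropped the per-VMA lock, so the arch fault handler must only call vma_end_read() in the remaining cases. This is not kernel code. Every mock_* name, the flag values, and the counter-based lock are assumptions made up for illustration.

```c
#include <assert.h>

/* Hypothetical stand-ins for the kernel's fault flags; values are illustrative. */
enum {
	MOCK_FAULT_RETRY     = 1 << 0,
	MOCK_FAULT_COMPLETED = 1 << 1,
	MOCK_FAULT_ERROR     = 1 << 2,
};

/* Toy VMA whose read lock is modelled as a plain counter. */
struct mock_vma {
	int read_locks;
};

static void mock_vma_start_read(struct mock_vma *vma)
{
	vma->read_locks++;
}

static void mock_vma_end_read(struct mock_vma *vma)
{
	assert(vma->read_locks > 0);	/* fires on a double unlock */
	vma->read_locks--;
}

/*
 * Callee side of the rule: when the result contains RETRY or COMPLETED,
 * the lock is dropped before returning. The result is passed in by the
 * caller only so that each case can be exercised.
 */
static unsigned int mock_handle_mm_fault(struct mock_vma *vma,
					 unsigned int result)
{
	if (result & (MOCK_FAULT_RETRY | MOCK_FAULT_COMPLETED))
		mock_vma_end_read(vma);
	return result;
}

/*
 * Caller side, mirroring the shape of the riscv fix: release the lock
 * only if the callee did not already do so. Returns the number of read
 * locks still held; 0 means lock and unlock were balanced.
 */
static int mock_arch_fault(struct mock_vma *vma, unsigned int result)
{
	unsigned int fault;

	mock_vma_start_read(vma);
	fault = mock_handle_mm_fault(vma, result);
	if (!(fault & (MOCK_FAULT_RETRY | MOCK_FAULT_COMPLETED)))
		mock_vma_end_read(vma);
	return vma->read_locks;
}
```

Calling mock_arch_fault() with 0, MOCK_FAULT_RETRY, MOCK_FAULT_COMPLETED or MOCK_FAULT_ERROR should leave the counter at zero each time. If the caller's check is replaced with an unconditional mock_vma_end_read(), as in the pre-fix riscv code, the assertion in mock_vma_end_read() trips on the RETRY and COMPLETED cases, which is the toy equivalent of the unbalanced unlock behind the boot failure.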