Re: [PATCH v4 29/33] x86/mm: try VMA lock-based page fault handling first

On Mon, Jul 3, 2023 at 8:24 AM Suren Baghdasaryan <surenb@xxxxxxxxxx> wrote:
>
> On Mon, Jul 3, 2023 at 2:45 PM Suren Baghdasaryan <surenb@xxxxxxxxxx> wrote:
> >
> > On Mon, Jul 3, 2023 at 6:52 AM Holger Hoffstätte
> > <holger@xxxxxxxxxxxxxxxxxxxxxx> wrote:
> > >
> > > On 2023-07-03 12:47, Jiri Slaby wrote:
> > > > Cc Jacob Young (from kernel bugzilla)
> > > >
> > > > On 30. 06. 23, 19:40, Suren Baghdasaryan wrote:
> > > >> On Fri, Jun 30, 2023 at 1:43 AM Jiri Slaby <jirislaby@xxxxxxxxxx> wrote:
> > > >>>
> > > >>> On 30. 06. 23, 10:28, Jiri Slaby wrote:
> > > >>>>   > 2348
> > > >>>> clone3({flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, child_tid=0x7fcaa5882990, parent_tid=0x7fcaa5882990, exit_signal=0, stack=0x7fcaa5082000, stack_size=0x7ffe00, tls=0x7fcaa58826c0} => {parent_tid=[2351]}, 88) = 2351
> > > >>>>   > 2350  <... clone3 resumed> => {parent_tid=[2372]}, 88) = 2372
> > > >>>>   > 2351  <... clone3 resumed> => {parent_tid=[2354]}, 88) = 2354
> > > >>>>   > 2351  <... clone3 resumed> => {parent_tid=[2357]}, 88) = 2357
> > > >>>>   > 2354  <... clone3 resumed> => {parent_tid=[2355]}, 88) = 2355
> > > >>>>   > 2355  <... clone3 resumed> => {parent_tid=[2370]}, 88) = 2370
> > > > >>>>   > 2370  mmap(NULL, 262144, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0 <unfinished ...>
> > > >>>>   > 2370  <... mmap resumed>)               = 0x7fca68249000
> > > >>>>   > 2372  <... clone3 resumed> => {parent_tid=[2384]}, 88) = 2384
> > > >>>>   > 2384  <... clone3 resumed> => {parent_tid=[2388]}, 88) = 2388
> > > >>>>   > 2388  <... clone3 resumed> => {parent_tid=[2392]}, 88) = 2392
> > > >>>>   > 2392  <... clone3 resumed> => {parent_tid=[2395]}, 88) = 2395
> > > > >>>>   > 2395  write(2, "runtime: marked free object in s"..., 36 <unfinished ...>
> > > >>>>
> > > > >>>> I.e. IIUC, all are threads (CLONE_VM), thread 2370 mapped ANON
> > > > >>>> 0x7fca68249000 - 0x7fca68288fff (262144 bytes), and Go in thread 2395 thinks
> > > > >>>> for some reason that 0x7fca6824bec8 in that region is "bad".
> > > >>
> > > >> Thanks for the analysis Jiri.
> > > >> Is it possible from these logs to identify whether 2370 finished the
> > > >> mmap operation before 2395 tried to access 0x7fca6824bec8? That access
> > > >> has to happen only after mmap finishes mapping the region.
> > > >
> > > > Hi,
> > > >
> > > > it's hard to tell, but I assume so.
> > > >
> > > > > For now, forget about Go's overly complicated, hard-to-reproduce case and concentrate on the very nice reduced testcase in:
> > > >   https://bugzilla.kernel.org/show_bug.cgi?id=217624
> > > > ;)
> > > >
> > > > FWIW, I can reproduce using the test case too.
> >
> > Thanks for the reproducer, Jiri!
> > Let me try it and see if I can figure this one out.
>
> Interestingly I can't reproduce it with qemu emulator (reproducer
> returns 1) but my host machine with the same kernel reproduces it
> every time. Will try tracing the major code paths to see what's going
> on.
> I have to leave for a day but will resume in the evening once I'm home.

I posted a patch to disable per-VMA locks by default for now:
https://lore.kernel.org/all/20230703182150.2193578-1-surenb@xxxxxxxxxx/
Will re-enable them once we figure this issue out.
Thanks,
Suren.

> Thanks,
> Suren.
>
> >
> > > >
> > > > thanks,
> > >
> > > As another (admittedly correlation-only) data point, I noticed at least hourly crashes
> > > of Firefox-114 after upgrading to 6.4.1, which had never happened before with 6.3.x.
> > > After reverting 0bff0aaea03e2a3ed6 - with a bit of context fixup due to follow-up
> > > commits in 6.4.1 - it has been rock stable again, for several hours now.
> > >
> > > cheers
> > > Holger
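
The address arithmetic behind Jiri's reading of the strace excerpt can be checked with a few lines of Python. This is only a sketch: the constants are copied from the quoted log, the helper name in_mapping is made up for illustration, and the 4096-byte page size is an assumption (the usual x86-64 base page size), not something the log states.

```python
# Values copied from the strace excerpt quoted in this thread.
MAP_START = 0x7fca68249000  # address returned by mmap() in thread 2370
MAP_LEN = 262144            # length argument passed to mmap()
ADDR = 0x7fca6824bec8       # address reported as "bad" by thread 2395
PAGE_SIZE = 4096            # assumption: 4 KiB base pages


def in_mapping(addr, start, length):
    """Return True if addr lies in the half-open range [start, start + length)."""
    return start <= addr < start + length


# Byte offset of the reported address from the start of the mapping.
offset = ADDR - MAP_START

print("inside mapping:", in_mapping(ADDR, MAP_START, MAP_LEN))
print("offset into mapping:", hex(offset))
print("page index within mapping:", offset // PAGE_SIZE)
```

This only confirms that the reported address falls inside the anonymous mapping created by thread 2370. It says nothing about whether the mmap had completed before the access, which is the ordering question raised earlier in the thread.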




