Re: a crash when running strace from persistent memory

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On Thu, 3 Sep 2020, Linus Torvalds wrote:

> On Thu, Sep 3, 2020 at 12:24 PM Mikulas Patocka <mpatocka@xxxxxxxxxx> wrote:
> >
> > There's a bug when you run strace from dax-based filesystem.
> >
> > -- create real or emulated persistent memory device (/dev/pmem0)
> > mkfs.ext2 /dev/pmem0
> > -- mount it
> > mount -t ext2 -o dax /dev/pmem0 /mnt/test
> > -- copy the system to it (well, you can copy just a few files that are
> >    needed for running strace and ls)
> > cp -ax / /mnt/test
> > -- bind the system directories
> > mount --bind /dev /mnt/test/dev
> > mount --bind /proc /mnt/test/proc
> > mount --bind /sys /mnt/test/sys
> > -- run strace on the ls command
> > chroot /mnt/test/ strace /bin/ls
> >
> > You get this warning and ls is killed with SIGSEGV.
> >
> > I bisected the problem and it is caused by the commit
> > 17839856fd588f4ab6b789f482ed3ffd7c403e1f (gup: document and work around
> > "COW can break either way" issue). When I revert the patch (on the kernel
> > 5.9-rc3), the bug goes away.
> 
> Funky. I really don't see how it could cause that, but we have the
> UDDF issue too, so I'm guessing I will have to fix it the radical way
> with Peter Xu's series based on my "rip out COW special cases" patch.
> 
> Or maybe I'm just using that as an excuse for really wanting to apply
> that series.. Because we can't just revert that GUP commit due to
> security concerns.
> 
> > [   84.191504] WARNING: CPU: 6 PID: 1350 at mm/memory.c:2486 wp_page_copy.cold+0xdb/0xf6
> 
> I'm assuming this is the WARN_ON_ONCE(1) on line 2482, and you have
> some extra debug patch that causes that line to be off by 4? Because
> at least for me, line 2486 is actually an empty line in v5.9-rc3.

Yes, that's it. I added a few printk to look at the control flow.

> That said, I really think this is a pre-existing race, and all the
> "COW can break either way" patch does is change the timing (presumably
> due to the actual pattern of actually doing the COW changing).
> 
> See commit c3e5ea6ee574 ("mm: avoid data corruption on CoW fault into
> PFN-mapped VMA") for background.
> 
> Mikulas, can you check that everything works ok for that case if you
> apply Peter's series? See
> 
>     https://lore.kernel.org/lkml/20200821234958.7896-1-peterx@xxxxxxxxxx/

I applied these four patches and strace works well. There is no longer any 
warning or crash.

Mikulas

> or if you have 'b4' installed, use
> 
>     b4 am 20200821234958.7896-1-peterx@xxxxxxxxxx
> 
> to get the series..
> 
>                      Linus
> 





[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux