Re: [PATCH] arm64: Ensure VM_WRITE|VM_SHARED ptes are clean by default

On Tue, Nov 5, 2019 at 9:06 AM John Stultz <john.stultz@xxxxxxxxxx> wrote:
> On Tue, Nov 5, 2019 at 2:29 AM Will Deacon <will@xxxxxxxxxx> wrote:
> >
> > Hi John,
> >
> > On Mon, Nov 04, 2019 at 05:16:42PM -0800, John Stultz wrote:
> > > On Tue, Oct 29, 2019 at 8:31 AM Catalin Marinas <catalin.marinas@xxxxxxx> wrote:
> > > >
> > > > Shared and writable mappings (__S.1.) should be clean (!dirty) initially
> > > > and made dirty on a subsequent write either through the hardware DBM
> > > > (dirty bit management) mechanism or through a write page fault. A clean
> > > > pte for the arm64 kernel is one that has PTE_RDONLY set and PTE_DIRTY
> > > > clear.
> > > >
> > > > The PAGE_SHARED{,_EXEC} attributes have PTE_WRITE set (PTE_DBM) and
> > > > PTE_DIRTY clear. Prior to commit 73e86cb03cf2 ("arm64: Move PTE_RDONLY
> > > > bit handling out of set_pte_at()"), it was the responsibility of
> > > > set_pte_at() to set the PTE_RDONLY bit and mark the pte clean if the
> > > > software PTE_DIRTY bit was not set. However, the above commit removed
> > > > the pte_sw_dirty() check and the subsequent setting of PTE_RDONLY in
> > > > set_pte_at() while leaving the PAGE_SHARED{,_EXEC} definitions
> > > > unchanged. The result is that shared+writable mappings are now dirty by
> > > > default.
> > > >
> > > > Fix the above by explicitly setting PTE_RDONLY in PAGE_SHARED{,_EXEC}.
> > > > In addition, remove the superfluous PTE_DIRTY bit from the kernel PROT_*
> > > > attributes.
> > > >
> > > > Fixes: 73e86cb03cf2 ("arm64: Move PTE_RDONLY bit handling out of set_pte_at()")
> > > > Cc: <stable@xxxxxxxxxxxxxxx> # 4.14.x-
> > > > Cc: Will Deacon <will@xxxxxxxxxx>
> > > > Signed-off-by: Catalin Marinas <catalin.marinas@xxxxxxx>
> > >
> > > Hey,
> > >   So I'm not yet sure why, but I've just validated that this patch is
> > > causing trouble with booting AOSP on HiKey960 with 5.4-rc6 (-rc5 works
> > > fine).
> >
> > Hmm. Annoying this wasn't spotted by CI.
> >
> > > It's odd, because the system does boot and is alive, but seems to stall
> > > out at the boot animation, and userland never finishes coming up to
> > > the home screen. It just sits there without a useful error message
> > > that I can find so far.  Reverting just this patch seems to solve it
> > > and it boots all the way.
> >
> > Given that I don't think the HiKey960 supports h/w DBM, my initial guess
> > is that the GPU is stuck on a page fault.
> >
> > > I'll try to dig further to see what might be going on (the mali driver
> > > is a prime suspect here), but I wanted to raise the flag since we're
> > > at the end of the -rc cycle.
> >
> > What exactly are you using for the mali driver?
>
> I've got an old r10p0 bifrost blob we were given and kernel patches
> I've carried forward since then.
>
> Again, I don't want to distract you too much for something that may be
> related to a blob driver. I mostly just wanted to raise a flag in case
> there was something off that might affect others.

Just as a further detail (about to close up for the day), I'm also
seeing this issue on the HiKey board. Similarly, reverting
747a70e60b72 resolves it.
It's a mali blob driver too, but a different one (utgard), which makes
me suspect this might be a real issue with something in AOSP.

I'll be testing on a db845c tomorrow morning to see if I can trigger
it there as well.

thanks
-john
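
For anyone following along, here is a minimal userspace sketch in C of the
clean/dirty distinction the quoted commit message describes. The bit
positions and helper names mirror what I believe the arm64 pgtable headers
use, but treat them as assumptions for illustration. The SKETCH_* masks are
hypothetical stand-ins for the shared+writable attributes before and after
the patch, not the kernel's actual PAGE_SHARED definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions, for illustration only. */
#define PTE_RDONLY (UINT64_C(1) << 7)   /* hardware read-only (AP[2]) */
#define PTE_DBM    (UINT64_C(1) << 51)  /* dirty bit management */
#define PTE_WRITE  PTE_DBM              /* the commit text: PTE_WRITE is PTE_DBM */
#define PTE_DIRTY  (UINT64_C(1) << 55)  /* software dirty bit */

/* Hypothetical stand-ins for shared+writable attributes. */
#define SKETCH_SHARED_BEFORE (PTE_WRITE)              /* PTE_RDONLY missing */
#define SKETCH_SHARED_AFTER  (PTE_WRITE | PTE_RDONLY) /* clean by default */

static bool pte_write(uint64_t pte)    { return (pte & PTE_WRITE) != 0; }
static bool pte_sw_dirty(uint64_t pte) { return (pte & PTE_DIRTY) != 0; }

/* Writable and not read-only reads as dirty from the hardware side. */
static bool pte_hw_dirty(uint64_t pte)
{
	return pte_write(pte) && !(pte & PTE_RDONLY);
}

static bool pte_dirty(uint64_t pte)
{
	return pte_sw_dirty(pte) || pte_hw_dirty(pte);
}

/* Without PTE_RDONLY the mapping starts out dirty; with it, clean. */
static void check_shared_defaults(void)
{
	assert(pte_dirty(SKETCH_SHARED_BEFORE));
	assert(!pte_dirty(SKETCH_SHARED_AFTER));
}
```

With the "after" attributes, a pte only becomes dirty once PTE_RDONLY is
cleared (hardware DBM) or PTE_DIRTY is set (write fault path), which is
the behaviour the patch is restoring.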