On Tue, Nov 5, 2019 at 2:29 AM Will Deacon <will@xxxxxxxxxx> wrote:
>
> Hi John,
>
> On Mon, Nov 04, 2019 at 05:16:42PM -0800, John Stultz wrote:
> > On Tue, Oct 29, 2019 at 8:31 AM Catalin Marinas <catalin.marinas@xxxxxxx> wrote:
> > >
> > > Shared and writable mappings (__S.1.) should be clean (!dirty) initially
> > > and made dirty on a subsequent write either through the hardware DBM
> > > (dirty bit management) mechanism or through a write page fault. A clean
> > > pte for the arm64 kernel is one that has PTE_RDONLY set and PTE_DIRTY
> > > clear.
> > >
> > > The PAGE_SHARED{,_EXEC} attributes have PTE_WRITE set (PTE_DBM) and
> > > PTE_DIRTY clear. Prior to commit 73e86cb03cf2 ("arm64: Move PTE_RDONLY
> > > bit handling out of set_pte_at()"), it was the responsibility of
> > > set_pte_at() to set the PTE_RDONLY bit and mark the pte clean if the
> > > software PTE_DIRTY bit was not set. However, the above commit removed
> > > the pte_sw_dirty() check and the subsequent setting of PTE_RDONLY in
> > > set_pte_at() while leaving the PAGE_SHARED{,_EXEC} definitions
> > > unchanged. The result is that shared+writable mappings are now dirty by
> > > default
> > >
> > > Fix the above by explicitly setting PTE_RDONLY in PAGE_SHARED{,_EXEC}.
> > > In addition, remove the superfluous PTE_DIRTY bit from the kernel PROT_*
> > > attributes.
> > >
> > > Fixes: 73e86cb03cf2 ("arm64: Move PTE_RDONLY bit handling out of set_pte_at()")
> > > Cc: <stable@xxxxxxxxxxxxxxx> # 4.14.x-
> > > Cc: Will Deacon <will@xxxxxxxxxx>
> > > Signed-off-by: Catalin Marinas <catalin.marinas@xxxxxxx>
> >
> > Hey,
> > So I'm not yet sure why, but I've just validated that this patch is
> > causing trouble with booting AOSP on HiKey960 with 5.4-rc6 (-rc5 works
> > fine).
>
> Hmm. Annoying this wasn't spotted by CI.
>
> > Its odd, because the system does boot and is alive, but seems to stall
> > out at the boot animation, and userland never finishes coming up to
> > the home screen. It just sits there without a useful error message
> > that I can find so far. Reverting just this patch seems to solve it
> > and it boots all the way.
>
> Given that I don't think the HiKey960 supports h/w DBM, my initial guess
> is that the GPU is stuck on a page fault.
>
> > I'll try to dig further to see what might be going on (the mali driver
> > is a prime suspect here), but I wanted to raise the flag since we're
> > at the end of the -rc cycle.
>
> What exactly are you using for the mali driver?

I've got an old r10p0 bifrost blob we were given and kernel patches
I've carried forward since then.

Again, I don't want to distract you too much for something that may be
related to a blob driver. I mostly just wanted to raise a flag in case
there was something off that might affect others.

> As an experiment, can you try reverting just the part of the patch that
> removes PTE_DIRTY from the PROT_* definitions? (see below)

I'll give this a try!

Feel free to let me know if there's anything else I should test.

thanks
-john
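
For reference, the clean/dirty rule described in the quoted commit message can be modelled in user space. This is only a rough sketch, not kernel code. The bit positions are assumptions modelled on the arm64 headers and should be checked against the tree in use. SHARED_BEFORE_FIX, SHARED_AFTER_FIX and model_write_fault() are hypothetical names made up for illustration, and the helpers only mimic what pte_sw_dirty() and friends are understood to test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t pteval_t;

/* Assumed bit positions (modelled on the arm64 headers; verify in your tree). */
#define PTE_RDONLY ((pteval_t)1 << 7)   /* hardware read-only (AP[2]) */
#define PTE_DBM    ((pteval_t)1 << 51)  /* dirty bit management */
#define PTE_WRITE  PTE_DBM              /* "PTE_WRITE set (PTE_DBM)" in the commit message */
#define PTE_DIRTY  ((pteval_t)1 << 55)  /* software dirty bit */

/* Software dirty: the PTE_DIRTY bit is set. */
static inline bool pte_sw_dirty(pteval_t pte)
{
	return (pte & PTE_DIRTY) != 0;
}

/* Hardware dirty: writable and not marked read-only. */
static inline bool pte_hw_dirty(pteval_t pte)
{
	return (pte & PTE_WRITE) && !(pte & PTE_RDONLY);
}

/* A pte counts as dirty if either view says so. */
static inline bool pte_dirty(pteval_t pte)
{
	return pte_sw_dirty(pte) || pte_hw_dirty(pte);
}

/* Hypothetical stand-ins for the permission bits of PAGE_SHARED. */
#define SHARED_BEFORE_FIX (PTE_WRITE)               /* no PTE_RDONLY: reads as dirty */
#define SHARED_AFTER_FIX  (PTE_WRITE | PTE_RDONLY)  /* clean until first write */

/*
 * Simplified model of a write fault without hardware DBM: a writable
 * pte is marked dirty and loses PTE_RDONLY; others are left alone.
 */
static inline pteval_t model_write_fault(pteval_t pte)
{
	if (pte & PTE_WRITE) {
		pte |= PTE_DIRTY;
		pte &= ~PTE_RDONLY;
	}
	return pte;
}

/* Small self-check of the behaviour the commit message describes. */
static inline void pte_model_selfcheck(void)
{
	assert(pte_dirty(SHARED_BEFORE_FIX));
	assert(!pte_dirty(SHARED_AFTER_FIX));
	assert(pte_dirty(model_write_fault(SHARED_AFTER_FIX)));
}
```

Calling pte_model_selfcheck() from a test program exercises the three cases. Under these assumptions, a shared+writable pte without PTE_RDONLY reads as dirty from the start. With PTE_RDONLY set it stays clean until the modelled write fault, which is the case that matters on hardware without DBM.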