Hi Peter, On 13.04.2022 18:43, Peter Xu wrote: > On Wed, Apr 13, 2022 at 04:03:28PM +0200, Marek Szyprowski wrote: >> On 05.04.2022 03:48, Peter Xu wrote: >>> We used to check against none pte in finish_fault(), with the assumption >>> that the orig_pte is always none pte. >>> >>> This change prepares us to be able to call do_fault() on !none ptes. For >>> example, we should allow that to happen for pte marker so that we can restore >>> information out of the pte markers. >>> >>> Let's change the "pte_none" check into detecting changes since we fetched >>> orig_pte. One trivial thing to take care of here is, when pmd==NULL for >>> the pgtable we may not initialize orig_pte at all in handle_pte_fault(). >>> >>> By default orig_pte will be all zeros however the problem is not all >>> architectures are using all-zeros for a none pte. pte_clear() will be the >>> right thing to use here so that we'll always have a valid orig_pte value >>> for the whole handle_pte_fault() call. >>> >>> Signed-off-by: Peter Xu <peterx@xxxxxxxxxx> >> This patch landed in today's linux next-202204213 as commit fa6009949163 >> ("mm: check against orig_pte for finish_fault()"). Unfortunately it >> causes serious system instability on some ARM 32bit machines. I've >> observed it on all tested boards (various Samsung Exynos based, >> Raspberry Pi 3b and 4b, even QEMU's virt 32bit machine) when kernel was >> compiled from multi_v7_defconfig. > Thanks for the report. > >> Here is a crash log from QEMU's ARM 32bit virt machine: >> >> 8<--- cut here --- >> Unable to handle kernel paging request at virtual address e093263c >> [e093263c] *pgd=42083811, *pte=00000000, *ppte=00000000 >> Internal error: Oops: 807 [#1] SMP ARM >> Modules linked in: >> CPU: 1 PID: 37 Comm: kworker/u4:0 Not tainted >> 5.18.0-rc2-00176-gfa6009949163 #11684 >> Hardware name: Generic DT based system >> PC is at cpu_ca15_set_pte_ext+0x4c/0x58 >> LR is at handle_mm_fault+0x46c/0xbb0 > I had a feeling that for some reason the pte_clear() isn't working right > there when it's applying to a kernel stack variable for arm32. I'm totally > newbie to arm32, so what I'm reading is this: > > https://people.kernel.org/linusw/arm32-page-tables > > Especially: > > https://protect2.fireeye.com/v1/url?k=35bc90ac-6a27a9bd-35bd1be3-0cc47a31cdbc-b032cb1d178dc691&q=1&e=c82daefb-c86b-4ca1-8db1-cadbdc124ed2&u=https%3A%2F%2Fdflund.se%2F%7Etriad%2Fimages%2Fclassic-mmu-page-table.jpg > > It does match with what I read from arm32's proc-v7-2level.S of it, where > from the comment above cpu_v7_set_pte_ext: > > * - ptep - pointer to level 2 translation table entry > * (hardware version is stored at +2048 bytes) <---------- > > So it seems to me that arm32 needs to store some metadata at offset 0x800 > of any pte_t* pointer passed over to pte_clear(), then it must be a real > pgtable or it'll write to random places in the kernel, am I correct? > > Does it mean that all pte_*() operations upon a kernel stack var will be > wrong? I thought it could happen easily in the rest of mm too but I didn't > yet check much. The fact shows that it's mostly possible the current code > just work well with arm32 and no such violation occured yet. > > That does sound a bit tricky, IMHO. But I don't have an immediate solution > to make it less tricky.. though I have a thought of workaround, by simply > not calling pte_clear() on the stack var. > > Would you try the attached patch to replace this problematic patch? So we > need to revert commit fa6009949163 and apply the new one. Please let me > know whether it'll solve the problem, so far I only compile tested it, but > I'll run some more test to make sure the uffd-wp scenarios will be working > right with the new version. I've reverted fa6009949163 and applied the attached patch on top of linux next-20220314. The ARM 32bit issues went away. :) Feel free to add: Reported-by: Marek Szyprowski <m.szyprowski@xxxxxxxxxxx> Tested-by: Marek Szyprowski <m.szyprowski@xxxxxxxxxxx> Best regards -- Marek Szyprowski, PhD Samsung R&D Institute Poland