On Fri, 1 Sep 2023, Mikhail Gavrilov wrote:
> On Fri, Sep 1, 2023 at 2:08 PM Hugh Dickins <hughd@xxxxxxxxxx> wrote:
> >
> >
> > Sorry about that, please try this instead, adds EXPORT_SYMBOL(pte_unmap).
> >
>
> Thanks, now I have a working kernel builded at commit a349d72fd9ef.
>
> > I've never used stackdepot before, but I've tried this out in good and
> > bad cases, and expect it to work for you, shedding light on where is
> > going wrong - machine should boot up fine, and in dmesg you'll find one
> > stacktrace between "WARNING: pte_map..." and "End of pte_map..." lines.
>
> Interesting, I checked twice but I didn't find any entry with
> "pte_map" in the kernel log after applying your patch.

That was very disappointing: I found it hard to explain, but was thinking
of sending you a similar patch, doing the same check on all your 32 CPUs -
maybe the stall being on CPU 0 in your photo was accidental.

But now I think I have the shameful answer (which studying your dmesg,
and the 82328 jiffies at 86 seconds in your photo, did help me towards).

That mm/pagewalk fix I put into 6.5 has a grievous oversight (and a video
of your failing 6.6 bootup would likely have shown a WARN_ON_ONCE from
the underflow in __rcu_read_unlock()).

Please revert the debug patch I sent yesterday (or earlier today),
please try booting with this one on top of a349d72fd9ef; and if that's
successful, then please go back to your original Rawhide tree and apply
this on top of that, to confirm that boots to a working system too - thanks.

With my apologies,

[PATCH] mm/pagewalk: fix bootstopping regression from extra pte_unmap()

[ Commit message yet to be written: it's actually something to go to 6.5
  stable, to correct i386 CONFIG_HIGHPTE there - though we know of no case
  where it is actually hit.
]

Signed-off-by: Hugh Dickins <hughd@xxxxxxxxxx>
---
 mm/pagewalk.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/pagewalk.c b/mm/pagewalk.c
index 2022333805d3..9e7d0276c38a 100644
--- a/mm/pagewalk.c
+++ b/mm/pagewalk.c
@@ -58,7 +58,7 @@ static int walk_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
 			pte = pte_offset_map(pmd, addr);
 		if (pte) {
 			err = walk_pte_range_inner(pte, addr, end, walk);
-			if (walk->mm != &init_mm)
+			if (walk->mm != &init_mm && addr < TASK_SIZE)
 				pte_unmap(pte);
 		}
 	} else {
-- 
2.35.3