Re: 6.6/regression/bisected - after commit a349d72fd9efc87c8fd1d16d3164752d84a7275b system stopped booting

On Sat, Sep 2, 2023 at 3:48 AM Hugh Dickins <hughd@xxxxxxxxxx> wrote:
> That was very disappointing: I found it hard to explain, but was thinking
> of sending you a similar patch, doing the same check on all your 32 CPUs -
> maybe the stall being on CPU 0 in your photo was accidental.
>
> But now I think I have the shameful answer (which studying your dmesg,
> and the 82328 jiffies at 86 seconds in your photo, did help me towards).
>
> That mm/pagewalk fix I put into 6.5 has a grievous oversight (and a
> video of your failing 6.6 bootup would likely have shown a WARN_ON_ONCE
> from the underflow in __rcu_read_unlock()).
>
> Please revert the debug patch I sent yesterday (or earlier today), please
> try booting with this one on top of a349d72fd9ef; and if that's successful,
> then please go back to your original Rawhide tree and apply this on top of
> that, to confirm that boots to a working system too - thanks.
>
> With my apologies,
>
> [PATCH] mm/pagewalk: fix bootstopping regression from extra pte_unmap()
>
> [ Commit message yet to be written: it's actually something to go to
> 6.5 stable, to correct i386 CONFIG_HIGHPTE there - though we know of
> no case where it is actually hit. ]
>
> Signed-off-by: Hugh Dickins <hughd@xxxxxxxxxx>
> ---
>  mm/pagewalk.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/pagewalk.c b/mm/pagewalk.c
> index 2022333805d3..9e7d0276c38a 100644
> --- a/mm/pagewalk.c
> +++ b/mm/pagewalk.c
> @@ -58,7 +58,7 @@ static int walk_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
>                         pte = pte_offset_map(pmd, addr);
>                 if (pte) {
>                         err = walk_pte_range_inner(pte, addr, end, walk);
> -                       if (walk->mm != &init_mm)
> +                       if (walk->mm != &init_mm && addr < TASK_SIZE)
>                                 pte_unmap(pte);
>                 }
>         } else {
> --
> 2.35.3
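
For anyone following the thread, here is a rough user-space model of the
imbalance this one-liner addresses. It is only a sketch under assumptions:
the model_* names and MODEL_TASK_SIZE are hypothetical, the counter merely
stands in for the RCU read-side nesting that pte_offset_map() and
pte_unmap() take and release since a349d72fd9ef, and the map-side test
(init_mm or addr >= TASK_SIZE selecting the non-locking kernel lookup) is
my reading of mm/pagewalk.c rather than something visible in the hunk above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user/kernel split; the real TASK_SIZE is arch-specific. */
#define MODEL_TASK_SIZE 0x0000800000000000ULL

/* Stand-in for the RCU read-side nesting depth. */
static int model_rcu_depth;

/* Models pte_offset_map(): enters a read-side section. */
static void model_map(void) { model_rcu_depth++; }

/* Models the kernel-address lookup: assumed to take no lock. */
static void model_map_kernel(void) { }

/* Models pte_unmap(): leaves a read-side section. */
static void model_unmap(void) { model_rcu_depth--; }

/* Unmap condition before the patch: only init_mm is excluded. */
static bool unmap_before_fix(bool is_init_mm, unsigned long long addr)
{
	(void)addr;
	return !is_init_mm;
}

/* Unmap condition after the patch: kernel addresses are excluded too. */
static bool unmap_after_fix(bool is_init_mm, unsigned long long addr)
{
	return !is_init_mm && addr < MODEL_TASK_SIZE;
}

/* Runs one modeled map/unmap pair and returns the resulting depth. */
static int model_walk(bool is_init_mm, unsigned long long addr,
		      bool (*should_unmap)(bool, unsigned long long))
{
	model_rcu_depth = 0;
	if (is_init_mm || addr >= MODEL_TASK_SIZE)
		model_map_kernel();
	else
		model_map();
	if (should_unmap(is_init_mm, addr))
		model_unmap();
	return model_rcu_depth;
}

/* Self-check: a balanced walk ends at depth 0, an underflow goes negative. */
void model_selftest(void)
{
	assert(model_walk(false, MODEL_TASK_SIZE, unmap_before_fix) < 0);
	assert(model_walk(false, MODEL_TASK_SIZE, unmap_after_fix) == 0);
}
```

In this model, a kernel address walked in a non-init mm leaves the depth
negative with the old condition and balanced with the new one, which would
match the underflow in __rcu_read_unlock() mentioned above.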

Great, this is the right patch.
Both builds, a349d72fd9ef and the latest Rawhide (now at 99d99825fc07),
work fine after applying this patch.
Thank you very much.
Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@xxxxxxxxx>

-- 
Best Regards,
Mike Gavrilov.




