On Fri, Sep 01, 2023 at 03:45:28AM +0500, Mikhail Gavrilov wrote:
> Hi,
> next release cycle, and another regression.
> Yesterday after another kernel update in Fedora Rawhide system stopped booting.
> Today thanks to git bisect, I found out that this is a commit:
>
> ❯ git bisect bad
> a349d72fd9efc87c8fd1d16d3164752d84a7275b is the first bad commit
> commit a349d72fd9efc87c8fd1d16d3164752d84a7275b
> Author: Hugh Dickins <hughd@xxxxxxxxxx>
> Date: Tue Jul 11 21:30:40 2023 -0700
>
>     mm/pgtable: add rcu_read_lock() and rcu_read_unlock()s
>
>     Patch series "mm: free retracted page table by RCU", v3.
>
>     Some mmap_lock avoidance i.e. latency reduction. Initially just for the
>     case of collapsing shmem or file pages to THPs: the usefulness of
>     MADV_COLLAPSE on shmem is being limited by that mmap_write_lock it
>     currently requires.
>
>     Likely to be relied upon later in other contexts e.g. freeing of empty
>     page tables (but that's not work I'm doing). mmap_write_lock avoidance
>     when collapsing to anon THPs? Perhaps, but again that's not work I've
>     done: a quick attempt was not as easy as the shmem/file case.
>
>     These changes (though of course not these exact patches) have been in
>     Google's data centre kernel for three years now: we do rely upon them.
>
>
>     This patch (of 13):
>
>     Before putting them to use (several commits later), add rcu_read_lock() to
>     pte_offset_map(), and rcu_read_unlock() to pte_unmap(). Make this a
>     separate commit, since it risks exposing imbalances: prior commits have
>     fixed all the known imbalances, but we may find some have been missed.
>
>     Link: https://lkml.kernel.org/r/7cd843a9-aa80-14f-5eb2-33427363c20@xxxxxxxxxx
>     Link: https://lkml.kernel.org/r/d3b01da5-2a6-833c-6681-67a3e024a16f@xxxxxxxxxx
>     Signed-off-by: Hugh Dickins <hughd@xxxxxxxxxx>
>     <long cc list omitted>...
>     Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
>
>  include/linux/pgtable.h | 4 ++--
>  mm/pgtable-generic.c | 4 ++--
>  2 files changed, 4 insertions(+), 4 deletions(-)
>
> It looks like the hang happens so early that when booting into a
> working kernel and running "journalctl -b -1" I see in the console the
> log of the previous kernel which was booted before the problematic
> kernel.
> Therefore, I apologize that I can't provide the kernel logs.
> I can provides only photos when backtrace appears on my monitor:
> Here we waiting: https://ibb.co/5xmm0BH
> And then I see backtrace: https://ibb.co/TLLGFNP
>
> Unfortunately I can't revert commit
> a349d72fd9efc87c8fd1d16d3164752d84a7275b for testing more fresh builds
> because of conflicts.
>
> My hardware: https://linux-hardware.org/?probe=dd5735f315
> I also attached kernel build config and full bisect log.
>

Thanks for the regression report. I'm adding it to regzbot:

#regzbot ^introduced: a349d72fd9efc8
#regzbot title: rcu_read_{lock,unlock}() causes unbootable system with backtrace

-- 
An old man doll... just what I always wanted! - Clara