Re: [PATCH 6.6 000/389] 6.6.76-rc2 review

On Mon, Feb 17, 2025 at 05:00:43PM +0530, Naresh Kamboju wrote:
> On Sat, 8 Feb 2025 at 16:54, Naresh Kamboju <naresh.kamboju@xxxxxxxxxx> wrote:
[...]
> We observed a kernel warning on QEMU-ARM64 and FVP while running the
> newly added selftest: arm64: check_hugetlb_options. This issue appears
> on 6.6.76 onward and 6.12.13 onward, as reported in the stable review [1].
> However, the test case passes successfully on stable 6.13.
> 
> The selftests: arm64: check_hugetlb_options test was introduced following
> the recent upgrade of kselftest test sources to the stable 6.13 branch.
> As you are aware, LKFT runs the latest kselftest sources (from stable
> 6.13.x) on 6.12.x, 6.6.x, and older kernels for validation purposes.
> 
> From Anders' bisection results, we identified that the missing patch on
> 6.12 is likely causing this regression:
> 
> First fixed commit:
> [25c17c4b55def92a01e3eecc9c775a6ee25ca20f]
> hugetlb: arm64: add MTE support

I wouldn't backport this and it's definitely not a fix for the problem
reported.

> Could you confirm whether this patch is eligible for backporting to
> 6.12 and 6.6 kernels?
> If backporting is not an option, we will need to skip running this
> test case on older kernels.
> 
> > 1)
> > Regression on qemu-arm64 and FVP noticed this kernel warning running
> > selftests: arm64: check_hugetlb_options test case on 6.6.76-rc1 and
> > 6.6.76-rc2.
> >
> > Test regression: WARNING-arch-arm64-mm-copypage-copy_highpage
> >
> > ------------[ cut here ]------------
> > [   96.920028] WARNING: CPU: 1 PID: 3611 at
> > arch/arm64/mm/copypage.c:29 copy_highpage
> > (arch/arm64/include/asm/mte.h:87)
> > [   96.922100] Modules linked in: crct10dif_ce sm3_ce sm3 sha3_ce
> > sha512_ce sha512_arm64 fuse drm backlight ip_tables x_tables
> > [   96.925603] CPU: 1 PID: 3611 Comm: check_hugetlb_o Not tainted 6.6.76-rc2 #1
> > [   96.926956] Hardware name: linux,dummy-virt (DT)
> > [   96.927695] pstate: 43402009 (nZcv daif +PAN -UAO +TCO +DIT -SSBS BTYPE=--)
> > [   96.928687] pc : copy_highpage (arch/arm64/include/asm/mte.h:87)
> > [   96.929037] lr : copy_highpage
> > (arch/arm64/include/asm/alternative-macros.h:232
> > arch/arm64/include/asm/cpufeature.h:443
> > arch/arm64/include/asm/cpufeature.h:504
> > arch/arm64/include/asm/cpufeature.h:814 arch/arm64/mm/copypage.c:27)
> > [   96.929399] sp : ffff800088aa3ab0
> > [   96.930232] x29: ffff800088aa3ab0 x28: 00000000000001ff x27: 0000000000000000
> > [   96.930784] x26: 0000000000000000 x25: 0000ffff9b800000 x24: 0000ffff9b9ff000
> > [   96.931402] x23: fffffc0003257fc0 x22: ffff0000c95ff000 x21: ffff0000c93ff000
> > [   96.932054] x20: fffffc0003257fc0 x19: fffffc000324ffc0 x18: 0000ffff9b800000
> > [   96.933357] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
> > [   96.934091] x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
> > [   96.935095] x11: 0000000000000000 x10: 0000000000000000 x9 : 0000000000000000
> > [   96.935982] x8 : 0bfffc0001800000 x7 : 0000000000000000 x6 : 0000000000000000
> > [   96.936536] x5 : 0000000000000000 x4 : 0000000000000000 x3 : 0000000000000000
> > [   96.937089] x2 : 0000000000000000 x1 : ffff0000c9600000 x0 : ffff0000c9400080
> > [   96.939431] Call trace:
> > [   96.939920] copy_highpage (arch/arm64/include/asm/mte.h:87)
> > [   96.940443] copy_user_highpage (arch/arm64/mm/copypage.c:40)
> > [   96.940963] copy_user_large_folio (mm/memory.c:5977 mm/memory.c:6109)
> > [   96.941535] hugetlb_wp (mm/hugetlb.c:5701)
> > [   96.941948] hugetlb_fault (mm/hugetlb.c:6237)
> > [   96.942344] handle_mm_fault (mm/memory.c:5330)
> > [   96.942794] do_page_fault (arch/arm64/mm/fault.c:513
> > arch/arm64/mm/fault.c:626)
> > [   96.943341] do_mem_abort (arch/arm64/mm/fault.c:846)
> > [   96.943797] el0_da (arch/arm64/kernel/entry-common.c:133
> > arch/arm64/kernel/entry-common.c:144
> > arch/arm64/kernel/entry-common.c:547)
> > [   96.944229] el0t_64_sync_handler (arch/arm64/kernel/entry-common.c:0)
> > [   96.944765] el0t_64_sync (arch/arm64/kernel/entry.S:599)
> > [   96.945383] ---[ end trace 0000000000000000 ]---

Prior to commit 25c17c4b55de ("hugetlb: arm64: add mte support"), there
was no hugetlb support with MTE, so the code path above should not be
reachable. The test seems to end up with a PROT_MTE hugetlb page, which
arch_validate_flags() should have prevented. Alternatively, something
else corrupts the page flags and we end up with a stray PG_mte_tagged
bit set.
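
To make the expected rejection concrete, here is a minimal userspace
model of the rule that arch_validate_flags() is expected to enforce. It
is a sketch under stated assumptions, not the kernel source: the
SKETCH_* names and bit positions are hypothetical and chosen only for
illustration, and the real function also depends on whether the system
supports MTE at all.

```c
#include <stdbool.h>

/* Hypothetical bit positions; the real VM_* values live in kernel headers. */
#define SKETCH_VM_MTE          (1UL << 0)  /* mapping requested PROT_MTE */
#define SKETCH_VM_MTE_ALLOWED  (1UL << 1)  /* mapping type may carry tags */

/*
 * Model of the rule: a mapping may only have the MTE flag set if the
 * "MTE allowed" flag was set for it earlier in the mmap() path.
 */
static bool sketch_validate_flags(unsigned long vm_flags)
{
	return !(vm_flags & SKETCH_VM_MTE) ||
	       (vm_flags & SKETCH_VM_MTE_ALLOWED);
}
```

Under this model, a hugetlb mapping on a kernel without hugetlb MTE
support would be expected to carry the MTE flag without the "allowed"
flag, so the check returns false and the mapping is refused.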

Does this happen with vanilla 6.6? I wonder whether we have always had
this issue and only noticed it once the hugetlb MTE kselftest started
running. There were some backports in this area but I don't see how
they would have caused this.

-- 
Catalin



