[adding the people involved in developing and applying the culprit to
the list of recipients]

FWIW, thread starts here:
https://lore.kernel.org/all/3a1b9909-45ac-4f97-ad68-d16ef1ce99db@xxxxxxxxxxxxxxx/

On 02.03.24 09:24, Pavin Joseph wrote:
> On 3/1/24 20:15, Linux regression tracking (Thorsten Leemhuis) wrote:
>> Does mainline show the same problem? The answer determines who later
>> will have to look into this.
> Yes, I reproduced the issue on mainline and the latest stable version
> 6.7.7 using your excellent guide.

Thx for testing and glad to hear. Still: if you have any feedback how to
make that guide even better, please let me know!

>> With a bit of luck somebody might have heard about problems like yours.
>> But if nobody comes up with an idea up within a few days we almost
>> certainly need a bisection to get down to the root of the problem.
>
> Full bisection done, culprit identified, and validated by reverting
> commit on mainline.

I assume the latter meant "reverting the culprit on mainline fixed the
problem"; if you meant something else, please let us know.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

> Attached bisection log and config used.
>
> Bisection final results:
> 7143c5f4cf2073193eb27c9cdb84fd4655d1802d is the first bad commit
> commit 7143c5f4cf2073193eb27c9cdb84fd4655d1802d
> Author: Steve Wahl <steve.wahl@xxxxxxx>
> Date:   Fri Jan 26 10:48:41 2024 -0600
>
>     x86/mm/ident_map: Use gbpages only where full GB page should be mapped.
>
>     commit d794734c9bbfe22f86686dc2909c25f5ffe1a572 upstream.
>
>     When ident_pud_init() uses only gbpages to create identity maps, large
>     ranges of addresses not actually requested can be included in the
>     resulting table; a 4K request will map a full GB. On UV systems, this
>     ends up including regions that will cause hardware to halt the system
>     if accessed (these are marked "reserved" by BIOS). Even processor
>     speculation into these regions is enough to trigger the system halt.
>
>     Only use gbpages when map creation requests include the full GB page
>     of space. Fall back to using smaller 2M pages when only portions of a
>     GB page are included in the request.
>
>     No attempt is made to coalesce mapping requests. If a request requires
>     a map entry at the 2M (pmd) level, subsequent mapping requests within
>     the same 1G region will also be at the pmd level, even if adjacent or
>     overlapping such requests could have been combined to map a full
>     gbpage. Existing usage starts with larger regions and then adds
>     smaller regions, so this should not have any great consequence.
>
>     [ dhansen: fix up comment formatting, simplifty changelog ]
>
>     Signed-off-by: Steve Wahl <steve.wahl@xxxxxxx>
>     Signed-off-by: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>
>     Cc: stable@xxxxxxxxxxxxxxx
>     Link:
>     https://lore.kernel.org/all/20240126164841.170866-1-steve.wahl%40hpe.com
>     Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
>
>  arch/x86/mm/ident_map.c | 23 ++++++++++++++++++-----
>  1 file changed, 18 insertions(+), 5 deletions(-)
>
> ----------
>
> Btw, the issue appears on LTS kernel 6.6.18 as well. I didn't build this
> one from the source and test, but installed it a while back from
> OpenSuse Tumbleweed repos as "kernel-longterm" is a new addition and is
> being actively tested over there.

P.S.:

#regzbot introduced d794734c9bbfe22f86686dc2909c25f5ffe1a572
#regzbot title x86/mm/ident_map: kexec now leads to reboot
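
For anyone skimming the thread: the rule the culprit's changelog describes ("only use gbpages when map creation requests include the full GB page of space") can be sketched roughly as below. This is an illustration only, not the code in arch/x86/mm/ident_map.c; the helper name request_covers_full_gb() and the locally defined SZ_1G constant are assumptions made for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Size of one gbpage (a 1 GiB mapping); defined locally for this sketch. */
#define SZ_1G (UINT64_C(1) << 30)

/*
 * Hypothetical helper, NOT the kernel's ident_pud_init():
 * for a mapping request covering [start, end), report whether the
 * 1G-aligned region that contains `addr` lies entirely inside the
 * request. Only then would a single gbpage map nothing beyond what
 * was asked for; otherwise the changelog says to fall back to 2M pages.
 */
static bool request_covers_full_gb(uint64_t start, uint64_t end, uint64_t addr)
{
    assert(start < end);

    /* Round addr down to the start of its 1G region. */
    uint64_t gb_start = addr & ~(SZ_1G - 1);

    /* Written as a subtraction so gb_start + SZ_1G cannot overflow. */
    return start <= gb_start && end > gb_start && end - gb_start >= SZ_1G;
}
```

With this sketch, a 4K request such as [0x1000, 0x2000) is reported as not covering its 1G region, which corresponds to the case where the changelog falls back to 2M pages instead of mapping a full GB.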