On Mon, May 20, 2024 at 01:43:42PM +0300, Andy Shevchenko wrote:
> On Fri, May 17, 2024 at 01:05:34PM -0700, Dave Hansen wrote:
> >
> > From: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>
> >
> > tl;dr: CPUs with CPUID.80000008H but without CPUID.01H:EDX[CLFSH]
> > will end up reporting cache_line_size()==0 and bad things happen.
> > Fill in a default on those to avoid the problem.
> >
> > Long Story:
> >
> > The kernel dies a horrible death if c->x86_cache_alignment (aka.
> > cache_line_size() is 0. Normally, this value is populated from
>
> Missing ) ?
>
> > c->x86_clflush_size.
> >
> > Right now the code is set up to get c->x86_clflush_size from two
> > places. First, modern CPUs get it from CPUID. Old CPUs that don't
> > have leaf 0x80000008 (or CPUID at all) just get some sane defaults
> > from the kernel in get_cpu_address_sizes().
> >
> > The vast majority of CPUs that have leaf 0x80000008 also get
> > ->x86_clflush_size from CPUID. But there are oddballs.
> >
> > Intel Quark CPUs[1] and others[2] have leaf 0x80000008 but don't set
> > CPUID.01H:EDX[CLFSH], so they skip over filling in ->x86_clflush_size:
> >
> > 	cpuid(0x00000001, &tfms, &misc, &junk, &cap0);
> > 	if (cap0 & (1<<19))
> > 		c->x86_clflush_size = ((misc >> 8) & 0xff) * 8;
> >
> > So they: land in get_cpu_address_sizes(), set vp_bits_from_cpuid=0 and
> > never fill in c->x86_clflush_size, assign c->x86_cache_alignment, and
> > hilarity ensues in code like:
> >
> > 	buffer = kzalloc(ALIGN(sizeof(*buffer), cache_line_size()),
> > 			 GFP_KERNEL);
> >
> > To fix this, always provide a sane value for ->x86_clflush_size.
> >
> > Big thanks to Andy Shevchenko for finding and reporting this and also
> > providing a first pass at a fix. But his fix was only partial and only
> > worked on the Quark CPUs. It would not, for instance, have worked on
> > the QEMU config.
> >
> > 1. https://raw.githubusercontent.com/InstLatx64/InstLatx64/master/GenuineIntel/GenuineIntel0000590_Clanton_03_CPUID.txt
> > 2. You can also get this behavior if you use "-cpu 486,+clzero"
> >    in QEMU.
>
> Tested-by: Andy Shevchenko <andriy.shevchenko@xxxxxxxxxxxxxxx>
>
> (as this obviously fixes the issue as it makes a partial revert of the culprit
> change).

What's the status of this?
(It seems you have to rebase it on top of the existing patches in the same area).

-- 
With Best Regards,
Andy Shevchenko
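
For readers following the thread, the failure mode described in the quoted
commit message can be sketched in userspace C. This is an illustration under
stated assumptions, not the kernel code: align_up() is a stand-in that uses
the same round-up arithmetic as the kernel's ALIGN() macro, while
clflush_size_from_cpuid(), effective_clflush_size() and the fallback value
of 64 are hypothetical names and values chosen only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fallback line size, chosen for illustration only. */
#define FALLBACK_CLFLUSH_SIZE 64u

/*
 * Stand-in for the kernel's ALIGN(): round x up to a multiple of a.
 * Only meaningful when a is a nonzero power of two. With a == 0 the
 * mask ~(a - 1) is 0, so the result is always 0.
 */
static size_t align_up(size_t x, size_t a)
{
	return (x + a - 1) & ~(a - 1);
}

/*
 * Hypothetical helper mirroring the CPUID.01H decode quoted above:
 * ebx plays the role of "misc", edx the role of "cap0". Returns 0
 * when EDX bit 19 (CLFSH) is clear, i.e. the field is not enumerated.
 */
static unsigned int clflush_size_from_cpuid(uint32_t ebx, uint32_t edx)
{
	if (edx & (1u << 19))
		return ((ebx >> 8) & 0xff) * 8;
	return 0;
}

/* Hypothetical fix shape: never let the size stay 0. */
static unsigned int effective_clflush_size(uint32_t ebx, uint32_t edx)
{
	unsigned int size = clflush_size_from_cpuid(ebx, edx);

	return size ? size : FALLBACK_CLFLUSH_SIZE;
}

/* Small self-check that walks through the good and bad cases. */
static void check_examples(void)
{
	/* CLFSH set, EBX[15:8] == 8: 8 * 8 == 64 bytes. */
	assert(clflush_size_from_cpuid(0x0800, 1u << 19) == 64);
	/* CLFSH clear: size stays 0 and the aligned size collapses to 0. */
	assert(clflush_size_from_cpuid(0x0800, 0) == 0);
	assert(align_up(24, 0) == 0);
	/* With a fallback in place the aligned size is sane again. */
	assert(align_up(24, effective_clflush_size(0x0800, 0)) == 64);
}
```

Calling check_examples() from a main() exercises both paths. The point of
the sketch is only that an alignment of 0 turns every requested size into 0,
which is why a default has to be filled in before the value is consumed.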