Hi Paul,

On Sun, Jul 28, 2013 at 09:16:29PM +0100, Paul Walmsley wrote:
>
> Commit 621a0147d5c921f4cc33636ccd0602ad5d7cbfbc ("ARM: 7757/1: mm:
> don't flush icache in switch_mm with hardware broadcasting") breaks
> the boot on OMAP2430SDP with omap2plus_defconfig. Tracked to an
> undefined instruction abort from the CP15 read in
> cache_ops_need_broadcast(). It turns out that gcc reorders the
> extended CP15 read above the is_smp() test. This breaks ARM1136 r0
> cores, since they don't support several CP15 registers that later ARM
> cores do. ARM1136JF-S TRM section 3.2.1 "Register allocation" has the
> details.

Cheers for tracking this down. Interestingly, I can't reproduce this with
anything other than GCC 4.5.* tools -- 4.6+ do what we want. Still, it
looks like a valid (if misguided) thing for the compiler to do.

> diff --git a/arch/arm/include/asm/cputype.h b/arch/arm/include/asm/cputype.h
> index 8c25dc4..f428eb0 100644
> --- a/arch/arm/include/asm/cputype.h
> +++ b/arch/arm/include/asm/cputype.h
> @@ -89,13 +89,25 @@ extern unsigned int processor_id;
>                  __val;                                                  \
>          })
>
> +
> +# if defined(CONFIG_CPU_V6)
> +/*
> + * The mrc in the read_cpuid_ext macro must not be reordered on ARMv6,
> + * else the compiler may move it before an is_smp() test, causing
> + * undefined instruction aborts on ARM1136 r0.
> + */
> +# define CPUID_EXT_REORDER      "cc", "memory"
> +# else
> +# define CPUID_EXT_REORDER      "cc"
> +# endif
> +
>  #define read_cpuid_ext(ext_reg)                                         \
>          ({                                                              \
>                  unsigned int __val;                                     \
>                  asm("mrc        p15, 0, %0, c0, " ext_reg               \
>                      : "=r" (__val)                                      \
>                      :                                                   \
> -                    : "cc");                                            \
> +                    : CPUID_EXT_REORDER);                               \
>                  __val;                                                  \
>          })

I wouldn't worry about checking for CPU_V6. Besides, we probably need this
to be re-evaluated across barrier() when we get CPU migration on a
big.LITTLE platform anyway (we should probably also drop the
__attribute_const__ for that). So you can just replace the "cc" (now that
Nico kindly explained the other day why those clobbers aren't needed) with
"memory".
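
For illustration only, here is a host-buildable sketch of that simplification: the clobber list is just "memory", with no CONFIG_CPU_V6 check. It is not the kernel code. The names sketch_is_smp, sketch_ext_reg, read_cpuid_ext_sketch() and cache_ops_need_broadcast_sketch() are hypothetical stand-ins, the asm template is left empty so that no privileged mrc is executed, and the bit field tested in the caller is only an example. It assumes GCC or Clang, since it uses statement expressions.

```c
#include <stdbool.h>

/* Hypothetical stand-ins so the sketch builds on any host; not kernel code. */
static bool sketch_is_smp;                       /* plays the role of is_smp() */
static volatile unsigned int sketch_ext_reg;     /* fake "extended CPUID" value */

/*
 * Model of read_cpuid_ext() with the clobber list reduced to "memory".
 * On ARM the template would be the mrc from the quoted patch; here it
 * is empty. The "memory" clobber tells the compiler that the asm may
 * access arbitrary memory, so it should not be moved across other
 * memory accesses.
 */
#define read_cpuid_ext_sketch()                                 \
	({                                                      \
		unsigned int __val = sketch_ext_reg;            \
		__asm__("" : "+r" (__val) : : "memory");        \
		__val;                                          \
	})

/* Caller shape under discussion: test first, read the register second. */
static int cache_ops_need_broadcast_sketch(void)
{
	if (!sketch_is_smp)
		return 0;       /* UP: the extended register is never read */

	/* Illustrative field check: bits [15:12] of the fake register. */
	return ((read_cpuid_ext_sketch() >> 12) & 0xf) < 1;
}
```

With sketch_is_smp left false the function returns 0 without reaching the register read, which is the ordering the real macro has to preserve on ARM1136 r0.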

An alternative is to add barrier() between is_smp() and the
read_cpuid_ext() in all callers, adding a fake read from the stack to the
latter (like I did for the per-cpu accessor). However, this relies on
fixing all callers for very little gain, so I don't think it's worth the
hassle.

I can cook a patch if you're tied up with other things -- just let me
know.

Cheers,

Will