On Thu, 18 Apr 2019, David Laight wrote: > From: David Laight > > Sent: 18 April 2019 10:21 > > From: Fenghua Yu > > > Sent: 17 April 2019 22:34 > > > > > > set_cpu_cap() calls locked BTS and clear_cpu_cap() calls locked BTR to > > > operate on bitmap defined in x86_capability. > > > > > > Locked BTS/BTR accesses a single unsigned long location. In 64-bit mode, > > > the location is at: > > > base address of x86_capability + (bit offset in x86_capability / 64) * 8 > > > > > > Since base address of x86_capability may not be aligned to unsigned long, > > > the single unsigned long location may cross two cache lines and > > > accessing the location by locked BTS/BTR introductions will cause > > > split lock. > > > > Isn't the problem that the type (and definition) of x86_capability[] are wrong. > > If the 'bitmap' functions are used for it, it should be defined as a bitmap. > > This would make it 'unsigned long' not __u32. > > > > This type munging of bitmaps only works on LE systems. > > > > OTOH the locked BTS/BTR instructions could be changed to use 32 bit accesses. > > ISTR some of the associated functions use byte accesses. > > > > Perhaps there ought to be asm wrappers for BTS/BTR that do 8bit and > > 32bit accesses. > > A quick look shows that this isn't the only __32[] that is being > cast to (unsigned long) and then to set/test/clear_bit() in those > files. > > I wonder how much other code is applying such casts? The reason for the cpuid stuff using u32 is that this is actually the width of the information retrieved from CPUID. Thanks, tglx