On 02/03/19 03:44, Fenghua Yu wrote:
> cpu_caps_cleared and cpu_caps_set may not be aligned to unsigned long.
> Atomic operations (i.e. set_bit and clear_bit) on the bitmaps may access
> two cache lines (a.k.a. split lock) and lock the bus to block all memory
> accesses from other processors to ensure atomicity.
> 
> To avoid the overall performance degradation from the bus locking, align
> the two variables to unsigned long.
> 
> Defining the variables as unsigned long may also fix the issue because
> they are naturally aligned to unsigned long. But that needs additional
> code changes. Adding __aligned(sizeof(unsigned long)) is a simpler fix.
> 
> Signed-off-by: Fenghua Yu <fenghua.yu@xxxxxxxxx>
> ---
>  arch/x86/kernel/cpu/common.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
> index cb28e98a0659..51ab37ba5f64 100644
> --- a/arch/x86/kernel/cpu/common.c
> +++ b/arch/x86/kernel/cpu/common.c
> @@ -488,8 +488,9 @@ static const char *table_lookup_model(struct cpuinfo_x86 *c)
>  	return NULL;		/* Not found */
>  }
> 
> -__u32 cpu_caps_cleared[NCAPINTS + NBUGINTS];
> -__u32 cpu_caps_set[NCAPINTS + NBUGINTS];
> +/* Unsigned long alignment to avoid split lock in atomic bitmap ops */
> +__u32 cpu_caps_cleared[NCAPINTS + NBUGINTS] __aligned(sizeof(unsigned long));
> +__u32 cpu_caps_set[NCAPINTS + NBUGINTS] __aligned(sizeof(unsigned long));
> 
>  void load_percpu_segment(int cpu)
>  {
> 

(resending including the list)

Why not instead change set_bit/clear_bit to use btsl/btrl instead of
btsq/btrq?

Thanks,

Paolo
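
For reference, a minimal userspace sketch (not kernel code) of the split-lock
reasoning behind the patch: the generic set_bit()/clear_bit() helpers operate
on unsigned long words, so a __u32 array that starts 4 bytes before a
cache-line boundary makes the 64-bit locked access straddle two cache lines.
The helper name word_splits_cache_line() and the assumed 64-byte cache line
are invented for this illustration only.

#include <stdint.h>
#include <stdio.h>

#define CACHE_LINE	64			/* assumed x86 line size */
#define BITS_PER_LONG	(8 * sizeof(unsigned long))

/*
 * Would the unsigned long word that a set_bit()-style helper touches
 * for bit 'nr' of a bitmap starting at 'base' straddle a cache line?
 */
static int word_splits_cache_line(uintptr_t base, unsigned int nr)
{
	uintptr_t word = base + (nr / BITS_PER_LONG) * sizeof(unsigned long);
	uintptr_t last = word + sizeof(unsigned long) - 1;

	return (word / CACHE_LINE) != (last / CACHE_LINE);
}

int main(void)
{
	/*
	 * Unaligned case: a __u32 array that happens to start 4 bytes
	 * before a cache-line boundary.  The first long word covers
	 * bytes 60..67 and crosses the line -> bus lock on x86.
	 */
	printf("base 0x3c, bit 0: split=%d\n",
	       word_splits_cache_line(0x3c, 0));

	/*
	 * Aligned case, as after the patch: every long word lies within
	 * a single cache line, so no split lock is possible.
	 */
	printf("base 0x40, bit 0: split=%d\n",
	       word_splits_cache_line(0x40, 0));

	return 0;
}

The sketch only models the 64-bit word access; Paolo's question points at the
other way out, since a 32-bit btsl/btrl on a naturally aligned __u32 array
also cannot cross a cache line.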