On 7/16/19 10:29 AM, Alex Kogan wrote:
>
>> On Jul 15, 2019, at 7:22 PM, Waiman Long <longman@xxxxxxxxxx
>> <mailto:longman@xxxxxxxxxx>> wrote:
>>
>> On 7/15/19 5:30 PM, Waiman Long wrote:
>>>> -#ifndef _GEN_PV_LOCK_SLOWPATH
>>>> +#if !defined(_GEN_PV_LOCK_SLOWPATH) && !defined(_GEN_CNA_LOCK_SLOWPATH)
>>>>
>>>>  #include <linux/smp.h>
>>>>  #include <linux/bug.h>
>>>> @@ -77,18 +77,14 @@
>>>>  #define MAX_NODES	4
>>>>
>>>>  /*
>>>> - * On 64-bit architectures, the mcs_spinlock structure will be 16 bytes in
>>>> - * size and four of them will fit nicely in one 64-byte cacheline. For
>>>> - * pvqspinlock, however, we need more space for extra data. To accommodate
>>>> - * that, we insert two more long words to pad it up to 32 bytes. IOW, only
>>>> - * two of them can fit in a cacheline in this case. That is OK as it is rare
>>>> - * to have more than 2 levels of slowpath nesting in actual use. We don't
>>>> - * want to penalize pvqspinlocks to optimize for a rare case in native
>>>> - * qspinlocks.
>>>> + * On 64-bit architectures, the mcs_spinlock structure will be 20 bytes in
>>>> + * size. For pvqspinlock or the NUMA-aware variant, however, we need more
>>>> + * space for extra data. To accommodate that, we insert two more long words
>>>> + * to pad it up to 36 bytes.
>>>>  */
>>> The 20 bytes figure is wrong. It is actually 24 bytes for 64-bit as the
>>> mcs_spinlock structure is 8-byte aligned. For better cacheline
>>> alignment, I would like to keep mcs_spinlock at 16 bytes as before.
>>> Instead, you can use encode_tail() to store the CNA node pointer in
>>> "locked". For instance, use (encode_tail() << 1) in locked to
>>> distinguish it from the regular locked=1 value.
>>
>> Actually, the encoded tail value is already shifted left by either 16
>> or 9 bits, so there is no need to shift it again. You can assign it
>> directly:
>>
>> mcs->locked = cna->encoded_tail;
>>
>> You do need to change the type of locked to "unsigned int", though,
>> for proper comparison with "1".
>>
> Got it, thanks.
>
I forgot to mention that I would like to see a boot command line option
to force the numa qspinlock code off, and maybe on as well. This can
help in testing, as you don't need to build two separate kernels, one
with NUMA_AWARE_SPINLOCKS on and one with it off.

For small 2-socket systems, numa qspinlock may not help much, so an
option to turn it off can be useful. Xen also has an option to turn off
PV qspinlock.
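To make the encoded-tail idea above a bit more concrete, something like
the following (untested sketch only, not the actual patch; the cna_node
layout and the helper names cna_pass_lock()/locked_is_cna_tail() are
just placeholders for this example):

/* CNA queue node wrapping the 16-byte mcs_spinlock. */
struct cna_node {
	struct mcs_spinlock	mcs;		/* ->locked changed to unsigned int */
	unsigned int		encoded_tail;	/* from encode_tail(), always > 1 */
};

/* Pass the lock: hand the successor our encoded tail instead of 1. */
static void cna_pass_lock(struct mcs_spinlock *next, struct cna_node *cna)
{
	/*
	 * The encoded tail is already shifted left by 9 or 16 bits, so
	 * it can never collide with the plain locked == 1 value and can
	 * be stored as-is, without any extra shift.
	 */
	WRITE_ONCE(next->locked, cna->encoded_tail);
}

/* On the waiter side, any value other than 0 or 1 is an encoded tail. */
static bool locked_is_cna_tail(unsigned int locked)
{
	return locked > 1;
}

Similarly, the boot command line option could be wired up with a
__setup() handler along these lines (again just a sketch; the parameter
name "numa_spinlock=" and its values are only illustrative):

/* 0 = auto (default), 1 = force on, -1 = force off */
static int numa_spinlock_flag;

static int __init numa_spinlock_setup(char *str)
{
	if (!strcmp(str, "auto")) {
		numa_spinlock_flag = 0;
		return 1;
	}
	if (!strcmp(str, "on")) {
		numa_spinlock_flag = 1;
		return 1;
	}
	if (!strcmp(str, "off")) {
		numa_spinlock_flag = -1;
		return 1;
	}
	return 0;
}
__setup("numa_spinlock=", numa_spinlock_setup);

The CNA slowpath would then only be patched in at boot when the flag is
not forced off (and, for "auto", when there is more than one NUMA node).

-Longman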