On Wed, 26 Jun 2013, Waiman Long wrote:

It would have been nice if you'd cc'ed the people who spent a massive
amount of time working on the spinlock code. Did that now.

> +#ifndef CONFIG_SMP
> +#define __LOCK_WITH_REFCOUNT(lock, count, u_name)	\
> +	unsigned int count;				\
> +	spinlock_t lock
> +
> +#elif defined(__SPINLOCK_HAS_REFCOUNT)
> +#define __LOCK_WITH_REFCOUNT(lock, count, u_name)	\
> +	union u_name {					\
> +		u64 __lock_count;			\
> +		spinlock_t lock;			\
> +		struct {				\
> +			arch_spinlock_t __raw_lock;	\
> +			unsigned int count;		\
> +		};					\
> +	}
> +
> +#else /* __SPINLOCK_HAS_REFCOUNT */
> +#define __LOCK_WITH_REFCOUNT(lock, count, u_name)	\
> +	union u_name {					\
> +		u64 __lock_count;			\
> +		struct {				\
> +			unsigned int count;		\
> +			spinlock_t lock;		\
> +		};					\
> +	}
> +
> +#endif /* __SPINLOCK_HAS_REFCOUNT */

This is a complete design disaster. You are violating every single
rule of proper layering. The differentiation of spinlock,
raw_spinlock and arch_spinlock is there for a reason, and you just
take the current implementation and map your desired function onto
it. If we take this, then we have fundamentally ruled out any future
change to the mappings of spinlock, raw_spinlock and arch_spinlock.

This layering is on purpose and it took a lot of effort to make it as
clean as it is now. Your proposed change undoes that and completely
wrecks preempt-rt.

What's even worse is that you go and fiddle with the basic data
structures:

> diff --git a/include/linux/spinlock_types.h b/include/linux/spinlock_types.h
> index 73548eb..cc107ad 100644
> --- a/include/linux/spinlock_types.h
> +++ b/include/linux/spinlock_types.h
> @@ -17,8 +17,27 @@
>
>  #include <linux/lockdep.h>
>
> +/*
> + * The presence of either one of the CONFIG_DEBUG_SPINLOCK or
> + * CONFIG_DEBUG_LOCK_ALLOC configuration parameter will force the
> + * spinlock_t structure to be 8-byte aligned.
> + *
> + * To support the spinlock/reference count combo data type for 64-bit SMP
> + * environment with spinlock debugging turned on, the reference count has
> + * to be integrated into the spinlock_t data structure in this special case.
> + * The spinlock_t data type will be 8 bytes larger if CONFIG_GENERIC_LOCKBREAK
> + * is also defined.
> + */
> +#if defined(CONFIG_64BIT) && (defined(CONFIG_DEBUG_SPINLOCK) ||\
> +			      defined(CONFIG_DEBUG_LOCK_ALLOC))
> +#define __SPINLOCK_HAS_REFCOUNT 1
> +#endif
> +
>  typedef struct raw_spinlock {
>  	arch_spinlock_t raw_lock;
> +#ifdef __SPINLOCK_HAS_REFCOUNT
> +	unsigned int count;
> +#endif
>  #ifdef CONFIG_GENERIC_LOCKBREAK
>  	unsigned int break_lock;
>  #endif

You need to do that because you blindly bolt your spinlock extension
into the existing code without taking the time to think about the
design implications.

Do not misunderstand me: I'm impressed by the improvement numbers,
but we need to find a way to do this sanely. You really have to
understand that there is a world outside of x86 and big machines.

Let's have a look at your UP implementation. It's completely broken:

> +/*
> + * Just add the value as the spinlock is a no-op

So what protects that from being preempted? Nothing, AFAICT.

What's even worse is that you break all architectures whose
arch_spinlock type is not the same size as an unsigned int,
e.g. SPARC.

So what you really want is a new data structure, e.g. something like

	struct spinlock_refcount;

and a proper set of new accessor functions, instead of an ad hoc
interface which is designed solely for the needs of the dcache
improvements.
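Something in this direction perhaps. Completely uncompiled and
untested, all names invented, lockdep integration and the per-arch
fast path left out:

	#include <linux/types.h>
	#include <linux/spinlock.h>

	/*
	 * Sits on top of spinlock_t, so it layers cleanly on mainline
	 * and on preempt-rt and leaves the core lock types alone.
	 */
	struct spinlock_refcount {
		spinlock_t	lock;
		unsigned int	count;
	};

	/*
	 * Generic slow path. spin_lock() disables preemption even on
	 * UP, which closes the preemption hole in your UP variant.
	 */
	static inline void slr_inc(struct spinlock_refcount *slr)
	{
		spin_lock(&slr->lock);
		slr->count++;
		spin_unlock(&slr->lock);
	}

	/* Drop a reference; returns true when the count hit zero. */
	static inline bool slr_dec_and_test(struct spinlock_refcount *slr)
	{
		bool zero;

		spin_lock(&slr->lock);
		zero = --slr->count == 0;
		spin_unlock(&slr->lock);
		return zero;
	}

An architecture which has a suitable 32bit arch_spinlock_t can then
opt in to a cmpxchg fast path on the combined lock/count word,
instead of the code silently assuming that
sizeof(arch_spinlock_t) == sizeof(unsigned int) holds everywhere.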
The first step to achieve your goal is to consolidate all the
identical copies of arch_spinlock_t and to talk to the maintainers of
the architectures which have a different (i.e. !32bit) notion of the
arch specific spinlock implementation, whether it's possible to move
over to a common data structure (rough sketch below). Once you have
achieved this, you can build your spinlock + refcount construct on
top of that.

Thanks,

	tglx
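P.S.: To illustrate what such a consolidated lock word could look
like. Again uncompiled; the layout is borrowed from the x86 ticket
lock of today, and the field names are made up:

	/* Hypothetical include/asm-generic/spinlock_types.h */
	#include <linux/types.h>

	typedef struct {
		union {
			u32 slock;		/* full 32bit lock word */
			struct {
				u16 owner;	/* ticket being served */
				u16 next;	/* next ticket handed out */
			} tickets;
		};
	} arch_spinlock_t;

	#define __ARCH_SPIN_LOCK_UNLOCKED	{ { 0 } }

	/*
	 * Architectures whose lock word fits into 32bit select this;
	 * the others (e.g. SPARC) keep their own definition and simply
	 * do not get the combined lock + count fast path.
	 */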