On Wed, 26 Jun 2013, Waiman Long wrote: > On 06/26/2013 07:06 PM, Thomas Gleixner wrote: > > On Wed, 26 Jun 2013, Waiman Long wrote: > > This is a complete design disaster. You are violating every single > > rule of proper layering. The differentiation of spinlock, raw_spinlock > > and arch_spinlock is there for a reason and you just take the current > > implementation and map your desired function to it. If we take this, > > then we fundamentally ruled out a change to the mappings of spinlock, > > raw_spinlock and arch_spinlock. This layering is on purpose and it > > took a lot of effort to make it as clean as it is now. Your proposed > > change undoes that and completely wreckages preempt-rt. > > Would you mind enlighten me how this change will break preempt-rt? The whole spinlock layering was done for RT. In mainline spinlock is mapped to raw_spinlock. On RT spinlock is mapped to a PI aware rtmutex. So your mixing of the various layers and the assumption of lock plus count being adjacent, does not work on RT at all. > The only architecture that will break, according to data in the > respectively arch/*/include/asm/spinlock_types.h files is PA-RISC > 1.x (it is OK in PA-RISC 2.0) whose arch_spinlock type has a size of > 16 bytes. I am not sure if 32-bit PA-RISC 1.x is still supported or > not, but we can always disable the optimization for this case. You have to do that right from the beginning with a proper data structure and proper accessor functions. Introducing wreckage and then retroactivly fixing it, is not a really good idea. > > So what you really want is a new data structure, e.g. something like > > struct spinlock_refcount() and a proper set of new accessor functions > > instead of an adhoc interface which is designed solely for the needs > > of dcache improvements. > > I had thought about that. The only downside is we need more code changes to > make existing code to use the new infrastructure. One of the drivers in my That's not a downside. It makes the usage of the infrastructure more obvious and not hidden behind macro magic. And the changes are trivial to script. > design is to minimize change to existing code. Personally, I have no > objection of doing that if others think this is the right way to go. Definitely. Something like this would be ok: struct lock_refcnt { int refcnt; struct spinlock lock; }; It does not require a gazillion of ifdefs and works for UP,SMP,DEBUG.... extern bool lock_refcnt_mod(struct lock_refcnt *lr, int mod, int cond); You also want something like: extern void lock_refcnt_disable(void); So we can runtime disable it e.g. when lock elision is detected and active. So you can force lock_refcnt_mod() to return false. static inline bool lock_refcnt_inc(struct lock_refcnt *sr) { #ifdef CONFIG_HAVE_LOCK_REFCNT return lock_refcnt_mod(sr, 1, INTMAX); #else return false; #endif } That does not break any code as long as CONFIG_HAVE_SPINLOCK_REFCNT=n. So we can enable it per architecture and make it depend on SMP. For RT we simply can force this switch to n. The other question is the semantic of these refcount functions. From your description the usage pattern is: if (refcnt_xxx()) return; /* Slow path */ spin_lock(); ... spin_unlock(); So it might be sensible to make this explicit: static inline bool refcnt_inc_or_lock(struct lock_refcnt *lr)) { #ifdef CONFIG_HAVE_SPINLOCK_REFCNT if (lock_refcnt_mod(lr, 1, INTMAX)) return true; #endif spin_lock(&lr->lock); return false; } Thanks, tglx -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html