Ingo Molnar wrote:
>
> Why not use the obvious solution: a _single_ wrlock for global
> access and read_can_lock() plus per cpu locks in the fastpath?

Obvious is not the qualifier I would use :)

Brilliant yes :)

>
> That way there's no global cacheline bouncing (just the _reading_ of
> a global cacheline - which will be nicely localized - on NUMA too) -
> and we will hold at most 1-2 locks at once!
>
> Something like:
>
> 	__cacheline_aligned DEFINE_RWLOCK(global_wrlock);
>
> 	DEFINE_PER_CPU(rwlock_t, local_lock);
>
>
> 	void local_read_lock(void)
> 	{
> 	again:
> 		read_lock(&per_cpu(local_lock, this_cpu));

Hmm... here we can see global_wrlock locked by one writer while this
cpu has already called local_read_lock() and calls this function again
-> deadlock, because we hold our own local_lock locked.

>
> 		if (unlikely(!read_can_lock(&global_wrlock))) {
> 			read_unlock(&per_cpu(local_lock, this_cpu));
> 			/*
> 			 * Just wait for any global write activity:
> 			 */
> 			read_unlock_wait(&global_wrlock);
> 			goto again;
> 		}
> 	}
>
> 	void global_write_lock(void)
> 	{
> 		write_lock(&global_wrlock);
>
> 		for_each_possible_cpu(i)
> 			write_unlock_wait(&per_cpu(local_lock, i));
> 	}
>
> Note how nesting friendly this construct is: we dont actually _hold_
> NR_CPUS locks all at once, we simply cycle through all CPUs and make
> sure they have our attention.
>
> No preempt overflow. No lockdep explosion. A very fast and scalable
> read path.
>
> Okay - we need to implement read_unlock_wait() and
> write_unlock_wait() which are similar to spin_unlock_wait(). The
> trivial first-approximation is:
>
> 	read_unlock_wait(x)
> 	{
> 		read_lock(x);
> 		read_unlock(x);
> 	}
>
> 	write_unlock_wait(x)
> 	{
> 		write_lock(x);
> 		write_unlock(x);
> 	}
>

Very interesting, and it could be changed to use a spinlock + depth counter
per cpu: we can detect recursion and avoid the deadlock, and we only use
one atomic operation per lock/unlock pair in the fastpath (this was the
reason we tried so hard to use a per-cpu spinlock during this thread).

__cacheline_aligned DEFINE_RWLOCK(global_wrlock);

struct ingo_local_lock {
	spinlock_t	lock;
	int		depth;
};
DEFINE_PER_CPU(struct ingo_local_lock, local_lock);

void local_read_lock(void)
{
	struct ingo_local_lock *lck;

	local_bh_and_preempt_disable();
	lck = &get_cpu_var(local_lock);
	if (++lck->depth > 0)	/* already locked */
		return;
again:
	spin_lock(&lck->lock);

	if (unlikely(!read_can_lock(&global_wrlock))) {
		spin_unlock(&lck->lock);
		/*
		 * Just wait for any global write activity:
		 */
		read_unlock_wait(&global_wrlock);
		goto again;
	}
}

void global_write_lock(void)
{
	write_lock(&global_wrlock);

	for_each_possible_cpu(i)
		spin_unlock_wait(&per_cpu(local_lock, i));
}

Hmm ?
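
For completeness, here is a minimal sketch of the matching unlock paths that
the code above leaves out. This is an assumption on my part, not something
from Ingo's mail: local_read_unlock(), global_write_unlock() and
local_bh_and_preempt_enable() are made-up names mirroring the lock side, and
lck->depth is assumed to start at -1 so that the outermost local_read_lock()
is the call that actually takes the per-cpu spinlock:

	/*
	 * Hypothetical counterparts to the lock paths above -- a sketch
	 * only, not part of the original proposal.  Assumes lck->depth
	 * is initialized to -1.
	 */
	void local_read_unlock(void)
	{
		struct ingo_local_lock *lck = &__get_cpu_var(local_lock);

		if (--lck->depth < 0)	/* outermost unlock: release the spinlock */
			spin_unlock(&lck->lock);
		put_cpu_var(local_lock);	/* pairs with get_cpu_var() in local_read_lock() */
		local_bh_and_preempt_enable();	/* assumed counterpart of the disable above */
	}

	void global_write_unlock(void)
	{
		/*
		 * Readers spinning in read_unlock_wait(&global_wrlock)
		 * can now proceed and retry their per-cpu spin_lock().
		 */
		write_unlock(&global_wrlock);
	}

With that, nested readers only bump the per-cpu depth counter, and the writer
never actually acquires the per-cpu locks -- it only waits, one cpu at a time,
for each of them to be released.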