* Stephen Hemminger <shemminger@xxxxxxxxxx> wrote:

> +void xt_info_wrlock_bh(void)
> +{
> +	unsigned int i;
> +
> +	local_bh_disable();
> +	for_each_possible_cpu(i) {
> +		write_lock(&per_cpu(xt_info_locks, i));
> +#if NR_CPUS > (PREEMPT_MASK - 1)
> +		/*
> +		 * Since spin_lock disables preempt, the following is
> +		 * required to avoid overflowing the preempt counter
> +		 */
> +		preempt_enable_no_resched();
> +#endif
> +	}
> +}

Hm, this is rather ugly, and it will make a lot of instrumentation code 
explode. Why not use the obvious solution: a _single_ wrlock for global 
access and read_can_lock() plus per-CPU locks in the fast path?

That way there is no global cacheline bouncing (just the _reading_ of a 
global cacheline, which stays nicely localized, on NUMA too), and we 
hold at most 1-2 locks at once!

Something like:

	__cacheline_aligned DEFINE_RWLOCK(global_wrlock);

	DEFINE_PER_CPU(rwlock_t, local_lock);

	void local_read_lock(void)
	{
	again:
		read_lock(&per_cpu(local_lock, smp_processor_id()));

		if (unlikely(!read_can_lock(&global_wrlock))) {
			read_unlock(&per_cpu(local_lock, smp_processor_id()));
			/*
			 * Just wait for any global write activity:
			 */
			read_unlock_wait(&global_wrlock);
			goto again;
		}
	}

	void global_write_lock(void)
	{
		unsigned int i;

		write_lock(&global_wrlock);

		for_each_possible_cpu(i)
			write_unlock_wait(&per_cpu(local_lock, i));
	}

Note how nesting-friendly this construct is: we don't actually _hold_ 
NR_CPUS locks all at once, we simply cycle through all CPUs and make 
sure they have our attention.

No preempt overflow. No lockdep explosion. A very fast and scalable 
read path.

Okay, we need to implement read_unlock_wait() and write_unlock_wait(), 
which are similar to spin_unlock_wait(). The trivial first 
approximation is:

	read_unlock_wait(x)
	{
		read_lock(x);
		read_unlock(x);
	}

	write_unlock_wait(x)
	{
		write_lock(x);
		write_unlock(x);
	}

Hm?

	Ingo
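
For reference, here is a minimal userspace sketch of the construct 
described above, built on POSIX rwlocks rather than kernel primitives. 
It is only an illustration of the pattern under stated assumptions, not 
kernel code: NTHREADS stands in for NR_CPUS, the __thread variable 
this_thread stands in for the current CPU id (each thread is assumed to 
set it once at startup), pthread_rwlock_tryrdlock() plays the role of 
read_can_lock(), and the *_unlock_wait() helpers use the trivial 
lock-then-unlock approximation from above. All function names here 
mirror the sketch in the mail; they are not existing kernel APIs.

	/*
	 * Userspace illustration of the pattern: NOT kernel code.
	 * NTHREADS approximates NR_CPUS, this_thread approximates the
	 * current CPU id, tryrdlock approximates read_can_lock().
	 */
	#include <pthread.h>

	#define NTHREADS 8

	static pthread_rwlock_t global_wrlock = PTHREAD_RWLOCK_INITIALIZER;
	static pthread_rwlock_t local_lock[NTHREADS];

	static __thread int this_thread;	/* set by each thread at startup */

	static void init_locks(void)
	{
		for (int i = 0; i < NTHREADS; i++)
			pthread_rwlock_init(&local_lock[i], NULL);
	}

	/* Fast path: take only our own lock, then peek at the global one. */
	static void local_read_lock(void)
	{
	again:
		pthread_rwlock_rdlock(&local_lock[this_thread]);

		/* read_can_lock() analogue: is a writer holding global_wrlock? */
		if (pthread_rwlock_tryrdlock(&global_wrlock) != 0) {
			pthread_rwlock_unlock(&local_lock[this_thread]);

			/* read_unlock_wait() analogue: wait out the writer. */
			pthread_rwlock_rdlock(&global_wrlock);
			pthread_rwlock_unlock(&global_wrlock);
			goto again;
		}
		/* Drop the probe immediately; we keep only the local lock. */
		pthread_rwlock_unlock(&global_wrlock);
	}

	static void local_read_unlock(void)
	{
		pthread_rwlock_unlock(&local_lock[this_thread]);
	}

	/*
	 * Slow path: take the global lock, then drain the per-thread locks
	 * one at a time, never holding more than one of them at once.
	 */
	static void global_write_lock(void)
	{
		pthread_rwlock_wrlock(&global_wrlock);

		/* write_unlock_wait() analogue. */
		for (int i = 0; i < NTHREADS; i++) {
			pthread_rwlock_wrlock(&local_lock[i]);
			pthread_rwlock_unlock(&local_lock[i]);
		}
	}

	static void global_write_unlock(void)
	{
		pthread_rwlock_unlock(&global_wrlock);
	}

The race a reader can win against the global check is harmless here: 
the writer, after taking global_wrlock, still drains every per-thread 
lock in turn, so any reader that slipped past the probe while holding 
its local lock is waited for before the write side proceeds.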