On Tue, Feb 26, 2013 at 3:26 AM, Srivatsa S. Bhat <srivatsa.bhat@xxxxxxxxxxxxxxxxxx> wrote:
> Hi Lai,
>
> On 02/25/2013 09:23 PM, Lai Jiangshan wrote:
>> Hi, Srivatsa,
>>
>> The target of the whole patchset is nice for me.
>
> Cool! Thanks :-)
>
>> A question: How did you find out the such usages of
>> "preempt_disable()" and convert them? did all are converted?
>>
>
> Well, I scanned through the source tree for usages which implicitly
> disabled CPU offline and converted them over.

How did you scan? Could you show how you scanned the source tree?
I can follow your steps to double-check.

> Its not limited to uses
> of preempt_disable() alone - even spin_locks, rwlocks, local_irq_disable()
> etc also help disable CPU offline. So I tried to dig out all such uses
> and converted them. However, since the merge window is open, a lot of
> new code is flowing into the tree. So I'll have to rescan the tree to
> see if there are any more places to convert.

I remember that some code makes an assumption like this:

	preempt_disable() (or something else)
	//the code assumes that the cpu_online_map can't be changed.
	preempt_enable()

It is very hard to find all such assumptions and fix them.
(I notice your patches mainly fix code around send_xxxx().)

>
>> And I think the lock is too complex and reinvent the wheel, why don't
>> you reuse the lglock?
>
> lglocks? No way! ;-) See below...
>
>> I wrote an untested draft here.
>>
>> Thanks,
>> Lai
>>
>> PS: Some HA tools(I'm writing one) which takes checkpoints of
>> virtual-machines frequently, I guess this patchset can speedup the
>> tools.
>>
>> From 01db542693a1b7fc6f9ece45d57cb529d9be5b66 Mon Sep 17 00:00:00 2001
>> From: Lai Jiangshan <laijs@xxxxxxxxxxxxxx>
>> Date: Mon, 25 Feb 2013 23:14:27 +0800
>> Subject: [PATCH] lglock: add read-preference local-global rwlock
>>
>> locality via lglock(trylock)
>> read-preference read-write-lock via fallback rwlock_t
>>
>> Signed-off-by: Lai Jiangshan <laijs@xxxxxxxxxxxxxx>
>> ---
>>  include/linux/lglock.h |   31 +++++++++++++++++++++++++++++++
>>  kernel/lglock.c        |   45 +++++++++++++++++++++++++++++++++++++++++++++
>>  2 files changed, 76 insertions(+), 0 deletions(-)
>>
>> diff --git a/include/linux/lglock.h b/include/linux/lglock.h
>> index 0d24e93..30fe887 100644
>> --- a/include/linux/lglock.h
>> +++ b/include/linux/lglock.h
>> @@ -67,4 +67,35 @@ void lg_local_unlock_cpu(struct lglock *lg, int cpu);
>>  void lg_global_lock(struct lglock *lg);
>>  void lg_global_unlock(struct lglock *lg);
>>
>> +struct lgrwlock {
>> +	unsigned long __percpu *fallback_reader_refcnt;
>> +	struct lglock lglock;
>> +	rwlock_t fallback_rwlock;
>> +};
>> +
>> +#define DEFINE_LGRWLOCK(name)						\
>> +	static DEFINE_PER_CPU(arch_spinlock_t, name ## _lock)		\
>> +	= __ARCH_SPIN_LOCK_UNLOCKED;					\
>> +	static DEFINE_PER_CPU(unsigned long, name ## _refcnt);		\
>> +	struct lgrwlock name = {					\
>> +		.fallback_reader_refcnt = &name ## _refcnt,		\
>> +		.lglock = { .lock = &name ## _lock } }
>> +
>> +#define DEFINE_STATIC_LGRWLOCK(name)					\
>> +	static DEFINE_PER_CPU(arch_spinlock_t, name ## _lock)		\
>> +	= __ARCH_SPIN_LOCK_UNLOCKED;					\
>> +	static DEFINE_PER_CPU(unsigned long, name ## _refcnt);		\
>> +	static struct lgrwlock name = {					\
>> +		.fallback_reader_refcnt = &name ## _refcnt,		\
>> +		.lglock = { .lock = &name ## _lock } }
>> +
>> +static inline void lg_rwlock_init(struct lgrwlock *lgrw, char *name)
>> +{
>> +	lg_lock_init(&lgrw->lglock, name);
>> +}
>> +
>> +void lg_rwlock_local_read_lock(struct lgrwlock *lgrw);
>> +void lg_rwlock_local_read_unlock(struct lgrwlock *lgrw);
>> +void lg_rwlock_global_write_lock(struct lgrwlock *lgrw);
>> +void lg_rwlock_global_write_unlock(struct lgrwlock *lgrw);
>>  #endif
>> diff --git a/kernel/lglock.c b/kernel/lglock.c
>> index 6535a66..463543a 100644
>> --- a/kernel/lglock.c
>> +++ b/kernel/lglock.c
>> @@ -87,3 +87,48 @@ void lg_global_unlock(struct lglock *lg)
>>  	preempt_enable();
>>  }
>>  EXPORT_SYMBOL(lg_global_unlock);
>> +
>> +void lg_rwlock_local_read_lock(struct lgrwlock *lgrw)
>> +{
>> +	struct lglock *lg = &lgrw->lglock;
>> +
>> +	preempt_disable();
>> +	if (likely(!__this_cpu_read(*lgrw->fallback_reader_refcnt))) {
>> +		if (likely(arch_spin_trylock(this_cpu_ptr(lg->lock)))) {
>> +			rwlock_acquire_read(&lg->lock_dep_map, 0, 0, _RET_IP_);
>> +			return;
>> +		}
>> +		read_lock(&lgrw->fallback_rwlock);
>> +	}
>> +
>> +	__this_cpu_inc(*lgrw->fallback_reader_refcnt);
>> +}
>> +EXPORT_SYMBOL(lg_rwlock_local_read_lock);
>> +
>> +void lg_rwlock_local_read_unlock(struct lgrwlock *lgrw)
>> +{
>> +	if (likely(!__this_cpu_read(*lgrw->fallback_reader_refcnt))) {
>> +		lg_local_unlock(&lgrw->lglock);
>> +		return;
>> +	}
>> +
>> +	if (!__this_cpu_dec_return(*lgrw->fallback_reader_refcnt))
>> +		read_unlock(&lgrw->fallback_rwlock);
>> +
>> +	preempt_enable();
>> +}
>> +EXPORT_SYMBOL(lg_rwlock_local_read_unlock);
>> +
>
> If I read the code above correctly, all you are doing is implementing a
> recursive reader-side primitive (ie., allowing the reader to call these
> functions recursively, without resulting in a self-deadlock).
>
> But the thing is, making the reader-side recursive is the least of our
> problems! Our main challenge is to make the locking extremely flexible
> and also safe-guard it against circular-locking-dependencies and deadlocks.
> Please take a look at the changelog of patch 1 - it explains the situation
> with an example.
>
>> +void lg_rwlock_global_write_lock(struct lgrwlock *lgrw)
>> +{
>> +	lg_global_lock(&lgrw->lglock);
>
> This does a for-loop on all CPUs and takes their locks one-by-one. That's
> exactly what we want to prevent, because that is the _source_ of all our
> deadlock woes in this case. In the presence of perfect lock ordering
> guarantees, this wouldn't have been a problem (that's why lglocks are
> being used successfully elsewhere in the kernel). In the stop-machine()
> removal case, the over-flexibility of preempt_disable() forces us to provide
> an equally flexible locking alternative. Hence we can't use such per-cpu
> locking schemes.
>
> You might note that, for exactly this reason, I haven't actually used any
> per-cpu _locks_ in this synchronization scheme, though it is named as
> "per-cpu rwlocks". The only per-cpu component here are the refcounts, and
> we consciously avoid waiting/spinning on them (because then that would be
> equivalent to having per-cpu locks, which are deadlock-prone). We use
> global rwlocks to get the deadlock-safety that we need.
>
>> +	write_lock(&lgrw->fallback_rwlock);
>> +}
>> +EXPORT_SYMBOL(lg_rwlock_global_write_lock);
>> +
>> +void lg_rwlock_global_write_unlock(struct lgrwlock *lgrw)
>> +{
>> +	write_unlock(&lgrw->fallback_rwlock);
>> +	lg_global_unlock(&lgrw->lglock);
>> +}
>> +EXPORT_SYMBOL(lg_rwlock_global_write_unlock);
>>
>
> Regards,
> Srivatsa S. Bhat
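
To make the reader-side recursion discussed above easier to follow, here
is a minimal single-threaded userspace model of the control flow in the
draft's lg_rwlock_local_read_lock()/lg_rwlock_local_read_unlock(). It is
only a sketch under stated assumptions: it models one CPU, it is not
kernel code, and every model_* name is hypothetical. The per-CPU
arch_spinlock_t becomes a bool, and the fallback rwlock_t becomes a count
of read holds. Preemption, lockdep and the writer side are left out, so
it says nothing about the deadlock concerns raised in the reply.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of ONE CPU's view of the draft lgrwlock. */
struct model_lgrw {
	bool percpu_lock_held;          /* stands in for this CPU's arch_spinlock_t */
	unsigned long fallback_refcnt;  /* stands in for fallback_reader_refcnt */
	unsigned long fallback_readers; /* read holds on the global fallback rwlock */
};

/* Mirrors the branch structure of lg_rwlock_local_read_lock() in the draft. */
static void model_read_lock(struct model_lgrw *l)
{
	if (l->fallback_refcnt == 0) {
		if (!l->percpu_lock_held) {
			/* trylock succeeds: fast path, refcount stays 0 */
			l->percpu_lock_held = true;
			return;
		}
		/* trylock failed: take the global fallback lock for reading */
		l->fallback_readers++;
	}
	/* already on the fallback path, or just entered it: count the nesting */
	l->fallback_refcnt++;
}

/* Mirrors the branch structure of lg_rwlock_local_read_unlock(). */
static void model_read_unlock(struct model_lgrw *l)
{
	if (l->fallback_refcnt == 0) {
		/* no fallback nesting: this is the fast-path holder */
		assert(l->percpu_lock_held);
		l->percpu_lock_held = false;
		return;
	}
	if (--l->fallback_refcnt == 0) {
		/* last nested fallback reader drops the global read lock */
		assert(l->fallback_readers > 0);
		l->fallback_readers--;
	}
}
```

In this model, a first model_read_lock() takes the fast path, a nested
second call falls back to the global lock and sets the refcount to 1, and
further nesting only bumps the refcount. The unlocks unwind in reverse
order, with the outermost one releasing the per-CPU lock.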