On Tue, Oct 27, 2020 at 09:49:50AM -0700, Ben Gardon wrote:
> Rescheduling while holding a spin lock is essential for keeping long
> running kernel operations running smoothly. Add the facility to
> cond_resched rwlocks.

This adds two new exports and two new macros without any in-tree users, which
is generally frowned upon. You and I know these will be used by KVM's new TDP
MMU, but the non-KVM folks, and more importantly the maintainers of this code,
are undoubtedly going to ask "why". I.e. these patches probably belong in the
KVM series to switch to a rwlock for the TDP MMU.

Regarding the code, it's all copy-pasted from the spinlock code and darn near
identical. It might be worth adding builder macros for these (rough, untested
sketch at the end of this mail).

> Signed-off-by: Ben Gardon <bgardon@xxxxxxxxxx>
> ---
>  include/linux/sched.h | 12 ++++++++++++
>  kernel/sched/core.c   | 40 ++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 52 insertions(+)
>
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 77179160ec3ab..2eb0c53fce115 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -1841,12 +1841,24 @@ static inline int _cond_resched(void) { return 0; }
>  })
>
>  extern int __cond_resched_lock(spinlock_t *lock);
> +extern int __cond_resched_rwlock_read(rwlock_t *lock);
> +extern int __cond_resched_rwlock_write(rwlock_t *lock);
>
>  #define cond_resched_lock(lock) ({                              \
>          ___might_sleep(__FILE__, __LINE__, PREEMPT_LOCK_OFFSET);\
>          __cond_resched_lock(lock);                              \
>  })
>
> +#define cond_resched_rwlock_read(lock) ({                      \
> +        __might_sleep(__FILE__, __LINE__, PREEMPT_LOCK_OFFSET); \
> +        __cond_resched_rwlock_read(lock);                       \
> +})
> +
> +#define cond_resched_rwlock_write(lock) ({                      \
> +        __might_sleep(__FILE__, __LINE__, PREEMPT_LOCK_OFFSET); \
> +        __cond_resched_rwlock_write(lock);                      \
> +})
> +
>  static inline void cond_resched_rcu(void)
>  {
>  #if defined(CONFIG_DEBUG_ATOMIC_SLEEP) || !defined(CONFIG_PREEMPT_RCU)
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index d2003a7d5ab55..ac58e7829a063 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -6152,6 +6152,46 @@ int __cond_resched_lock(spinlock_t *lock)
>  }
>  EXPORT_SYMBOL(__cond_resched_lock);
>
> +int __cond_resched_rwlock_read(rwlock_t *lock)
> +{
> +        int resched = should_resched(PREEMPT_LOCK_OFFSET);
> +        int ret = 0;
> +
> +        lockdep_assert_held(lock);
> +
> +        if (rwlock_needbreak(lock) || resched) {
> +                read_unlock(lock);
> +                if (resched)
> +                        preempt_schedule_common();
> +                else
> +                        cpu_relax();
> +                ret = 1;

AFAICT, this rather odd code flow from __cond_resched_lock() is an artifact of
code changes over the years and not intentionally weird. IMO, it would be
cleaner and easier to read as:

        int resched = should_resched(PREEMPT_LOCK_OFFSET);

        lockdep_assert_held(lock);

        if (!rwlock_needbreak(lock) && !resched)
                return 0;

        read_unlock(lock);
        if (resched)
                preempt_schedule_common();
        else
                cpu_relax();
        read_lock(lock);
        return 1;

> +                read_lock(lock);
> +        }
> +        return ret;
> +}
> +EXPORT_SYMBOL(__cond_resched_rwlock_read);
> +
> +int __cond_resched_rwlock_write(rwlock_t *lock)
> +{
> +        int resched = should_resched(PREEMPT_LOCK_OFFSET);
> +        int ret = 0;
> +
> +        lockdep_assert_held(lock);

This should be lockdep_assert_held_write.
> +
> +        if (rwlock_needbreak(lock) || resched) {
> +                write_unlock(lock);
> +                if (resched)
> +                        preempt_schedule_common();
> +                else
> +                        cpu_relax();
> +                ret = 1;
> +                write_lock(lock);
> +        }
> +        return ret;
> +}
> +EXPORT_SYMBOL(__cond_resched_rwlock_write);
> +
>  /**
>   * yield - yield the current processor to other threads.
>   *
> --
> 2.29.0.rc2.309.g374f81d7ae-goog
>
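
For the "builder macros" comment above, below is a rough, untested sketch of
the kind of thing I have in mind for the kernel/sched/core.c side. The macro
name and parameter list are made up for illustration, it uses the reworked
early-return flow suggested above, and the cond_resched_*() wrapper macros in
sched.h would want similar treatment:

/*
 * Sketch only: generate the three lock-breaking helpers from one template.
 * @needbreak, @assert, @unlock and @relock are the per-lock-type calls.
 */
#define DEFINE_COND_RESCHED_LOCK(func, locktype, needbreak, assert,    \
                                 unlock, relock)                        \
int func(locktype *lock)                                                \
{                                                                       \
        int resched = should_resched(PREEMPT_LOCK_OFFSET);             \
                                                                        \
        assert(lock);                                                   \
                                                                        \
        if (!needbreak(lock) && !resched)                               \
                return 0;                                               \
                                                                        \
        unlock(lock);                                                   \
        if (resched)                                                    \
                preempt_schedule_common();                              \
        else                                                            \
                cpu_relax();                                            \
        relock(lock);                                                   \
        return 1;                                                       \
}                                                                       \
EXPORT_SYMBOL(func)

DEFINE_COND_RESCHED_LOCK(__cond_resched_lock, spinlock_t,
                         spin_needbreak, lockdep_assert_held,
                         spin_unlock, spin_lock);
DEFINE_COND_RESCHED_LOCK(__cond_resched_rwlock_read, rwlock_t,
                         rwlock_needbreak, lockdep_assert_held,
                         read_unlock, read_lock);
DEFINE_COND_RESCHED_LOCK(__cond_resched_rwlock_write, rwlock_t,
                         rwlock_needbreak, lockdep_assert_held_write,
                         write_unlock, write_lock);

That also takes care of the lockdep_assert_held_write nit, since the write
variant simply passes a different assert.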