On Fri, 8 Mar 2024 at 22:41, Paul E. McKenney <paulmck@xxxxxxxxxx> wrote: > > Tasks Trace RCU needs a single-byte cmpxchg(), but no such thing exists. Because not all architectures support 1-byte cmpxchg? What prevents us from implementing it? > Therefore, rcu_trc_cmpxchg_need_qs() emulates one using field substitution > and a four-byte cmpxchg(), such that the other three bytes are always > atomically updated to their old values. This works, but results in > false-positive KCSAN failures because as far as KCSAN knows, this > cmpxchg() operation is updating all four bytes. > > This commit therefore encloses the cmpxchg() in a data_race() and adds > a single-byte instrument_atomic_read_write(), thus telling KCSAN exactly > what is going on so as to avoid the false positives. > > Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxx> > Cc: Marco Elver <elver@xxxxxxxxxx> > > --- > > Is this really the right way to do this? This code has a real data race per definition of data race, right? KCSAN instruments the primitive precisely per its real semantics, but the desired semantics does not match the real semantics. As such, to me the right way would be implementing cmpxchgb(). Otherwise, the workaround below is perfectly adequate. > diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h > index d5319bbe8c982..e83adcdb49b5f 100644 > --- a/kernel/rcu/tasks.h > +++ b/kernel/rcu/tasks.h > @@ -1460,6 +1460,7 @@ static void rcu_st_need_qs(struct task_struct *t, u8 v) > /* > * Do a cmpxchg() on ->trc_reader_special.b.need_qs, allowing for > * the four-byte operand-size restriction of some platforms. > + * > * Returns the old value, which is often ignored. > */ > u8 rcu_trc_cmpxchg_need_qs(struct task_struct *t, u8 old, u8 new) > @@ -1471,7 +1472,13 @@ u8 rcu_trc_cmpxchg_need_qs(struct task_struct *t, u8 old, u8 new) > if (trs_old.b.need_qs != old) > return trs_old.b.need_qs; > trs_new.b.need_qs = new; > - ret.s = cmpxchg(&t->trc_reader_special.s, trs_old.s, trs_new.s); > + > + // Although cmpxchg() appears to KCSAN to update all four bytes, > + // only the .b.need_qs byte actually changes. > + instrument_atomic_read_write(&t->trc_reader_special.b.need_qs, > + sizeof(t->trc_reader_special.b.need_qs)); > + ret.s = data_race(cmpxchg(&t->trc_reader_special.s, trs_old.s, trs_new.s)); > + > return ret.b.need_qs; > } > EXPORT_SYMBOL_GPL(rcu_trc_cmpxchg_need_qs); >