On Wed, Feb 19, 2025 at 08:29:47AM -0500, Joel Fernandes wrote:
>
>
> On 2/19/2025 8:22 AM, Paul E. McKenney wrote:
> > On Wed, Feb 19, 2025 at 07:43:08AM -0500, Joel Fernandes wrote:
> >> poll_state_synchronize_srcu() uses rcu_seq_done() unlike
> >> poll_state_synchronize_rcu() which uses rcu_seq_done_exact().
> >>
> >> The rcu_seq_done_exact() makes more sense for polling API, as with
> >> this API, there is a higher chance that there is a significant delay
> >> between the get_state..() and poll_state..() calls since a cookie
> >> can be stored and reused at a later time. During such a delay, if
> >> the gp_seq counter progresses more than ULONG_MAX/2 distance, then
> >> poll_state..() may return false for a long time unwantedly.
> >>
> >> Fix by using the more accurate rcu_seq_done_exact() API which is
> >> exactly what straight RCU's polling does.
> >>
> >> It may make sense, as future work, to add debug code here as well, where
> >> we compare a physical timestamp between get_state..() and poll_state()
> >> calls and yell if significant time has past but the grace period has
> >> still not progressed.
> >>
> >> Reviewed-by: Neeraj Upadhyay <Neeraj.Upadhyay@xxxxxxx>
> >> Signed-off-by: Joel Fernandes <joelagnelf@xxxxxxxxxx>
> >
> > Reviewed-by: Paul E. McKenney <paulmck@xxxxxxxxxx>
> >
> > But we should also run this by Kent Overstreet, given that bcachefs
> > uses this. Should be OK, given that bcachefs uses this API in the same
> > way that it does poll_state_synchronize_rcu(), but still...
>
> Thanks Paul! Adding Kent Overstreet to the email to raise any objections.

It sounds like rcu_done_exact() is indeed what we want - bcachefs uses
this for determining when objects may be reclaimed (as is typical with
rcu), so we don't want objects to be stranded a "significant time past
the grace period".

Is there any additional cost? I'm not seeing rcu_done_exact() in Linus's
tree yet.
Minor additional overhead would be totally fine; we use this from fs/bcachefs/rcu_pending.c which doesn't call it for each object.
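
For reference, the difference the patch description is getting at can be sketched in userspace roughly as below. This is a simplified model, not the kernel code: the model_* names are made up for illustration, the real helpers also use READ_ONCE() on a pointer to the counter, and the guard-band width (2 * state mask + 1) is written from memory of kernel/rcu/rcu.h and may differ between kernel versions, so treat it as an assumption.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Assumed layout: low 2 bits of the sequence hold grace-period state. */
#define MODEL_SEQ_CTR_SHIFT  2
#define MODEL_SEQ_STATE_MASK ((1UL << MODEL_SEQ_CTR_SHIFT) - 1)

/* Wrap-aware "a >= b": true when a is at most ULONG_MAX/2 ahead of b. */
static inline bool model_cmp_ge(unsigned long a, unsigned long b)
{
	return ULONG_MAX / 2 >= a - b;
}

/* Wrap-aware "a < b": true when a is more than ULONG_MAX/2 "ahead" of b. */
static inline bool model_cmp_lt(unsigned long a, unsigned long b)
{
	return ULONG_MAX / 2 < a - b;
}

/* Model of the plain check: done only while cur is within half the
 * counter space ahead of the cookie s. */
static inline bool model_seq_done(unsigned long cur, unsigned long s)
{
	return model_cmp_ge(cur, s);
}

/* Model of the exact check: only values in a small guard band just
 * below the cookie count as "not done"; anything else, including a
 * counter that has run more than half the space ahead, is done. */
static inline bool model_seq_done_exact(unsigned long cur, unsigned long s)
{
	return model_cmp_ge(cur, s) ||
	       model_cmp_lt(cur, s - (2 * MODEL_SEQ_STATE_MASK + 1));
}

/* Hypothetical self-check walking through the three interesting cases. */
static inline void model_self_check(void)
{
	unsigned long s = 100;                       /* stored cookie */
	unsigned long far = s + ULONG_MAX / 2 + 9;   /* > half the space ahead */

	/* Grace period not yet reached: both say "not done". */
	assert(!model_seq_done(96, s));
	assert(!model_seq_done_exact(96, s));

	/* Grace period just passed: both say "done". */
	assert(model_seq_done(104, s));
	assert(model_seq_done_exact(104, s));

	/* Stale cookie, counter far ahead: only the exact check says "done". */
	assert(!model_seq_done(far, s));
	assert(model_seq_done_exact(far, s));
}
```

In this model the exact variant costs one extra subtraction and comparison per poll, which is in line with the "minor additional overhead" mentioned above.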