Hello, Joel!

> Hi Uladzislau,
>
> On Thu, Feb 27, 2025 at 02:16:13PM +0100, Uladzislau Rezki (Sony) wrote:
> > Switch for using of get_state_synchronize_rcu_full() and
> > poll_state_synchronize_rcu_full() pair to debug a normal
> > synchronize_rcu() call.
> >
> > Just using "not" full APIs to identify if a grace period is
> > passed or not might lead to a false-positive kernel splat.
> >
> > It can happen, because get_state_synchronize_rcu() compresses
> > both normal and expedited states into one single unsigned long
> > value, so a poll_state_synchronize_rcu() can miss GP-completion
> > when synchronize_rcu()/synchronize_rcu_expedited() concurrently
> > run.
>
> Agreed, I provided a scenario below but let me know if I missed anything.
>
> > To address this, switch to poll_state_synchronize_rcu_full() and
> > get_state_synchronize_rcu_full() APIs, which use separate variables
> > for expedited and normal states.
>
> Reviewed-by: Joel Fernandes <joelagnelf@xxxxxxxxxx>
>
> For completeness and just to clarify how this may happen, firstly as noted:
> rcu_poll_gp_seq_start/end() is called for both begin/end of normal and exp
> GPs thus compressing the use of the rcu_state.gp_seq_polled counter for
> both normal and exp GPs.
>
> Then if we intersperse synchronize_rcu() with synchronize_rcu_expedited(),
> something like the following may happen.
>
> CPU 0                                   CPU 1
>
>                                         synchronize_rcu_expedited()
>                                         // -> rcu_poll_gp_seq_start()
>                                         // This does rcu_seq_start on the
>                                         // gp_seq_polled and
>                                         // notes the started gp_seq_polled
>                                         // (say its 5)
> synchronize_rcu()
> -> synchronize_rcu_normal()
>    -> rs.head.func =
>         get_state_synchronize_rcu();
>         // saves the value 12
>
>
> -> rcu_gp_init()
>    -> rcu_poll_gp_seq_start()
>       // rcu_seq_start does nothing
>       // but notes the pre-started
>       // gp_seq_polled (value 5)
>
>                                         -> rcu_gp_cleanup()
>                                         // -> rcu_poll_gp_seq_end()
>                                         // ends the gp_seq_polled since it
>                                         // matches prior saved gp_seq_polled (5)
>                                         // new gp_seq_polled is 8.
>
> /* NORMAL GP COMPLETES */
>
> rcu_gp_cleanup()
> -> rcu_sr_normal_gp_cleanup()
>    -> rcu_sr_normal_complete()
>       -> poll_state_synchronize_rcu()
>          -> returns FALSE because gp_seq_polled is still 8.
>          -> Warning (false positive)
>
Thank you for clarification, this is good for archive :)

--
Uladzislau Rezki
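
For anyone who wants to check the arithmetic in the scenario above (5, 12, 8), here is a minimal userspace sketch in C. It is only a model under stated assumptions, not the kernel code: the helper names (seq_start, seq_end, seq_snap, seq_done, poll_gp_seq_start, poll_gp_seq_end, run_scenario) and the two snapshot variables are made up for illustration, the low two bits of the counter are assumed to hold the grace-period state, the idle starting value 4 is assumed so that the first start yields 5, and locking, memory ordering, and wrap-safe comparison are all left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: the low two bits of the counter encode the GP state. */
#define SEQ_STATE_MASK 3UL

/* Model of the single polled counter shared by normal and expedited GPs. */
static unsigned long gp_seq_polled;

/* Simplified counter helpers (no barriers, no wrap handling). */
static void seq_start(unsigned long *sp) { *sp += 1; }
static void seq_end(unsigned long *sp) { *sp = (*sp | SEQ_STATE_MASK) + 1; }

static unsigned long seq_snap(const unsigned long *sp)
{
	/* Smallest counter value at which a full GP has surely elapsed. */
	return (*sp + 2 * SEQ_STATE_MASK + 1) & ~SEQ_STATE_MASK;
}

static bool seq_done(const unsigned long *sp, unsigned long cookie)
{
	return *sp >= cookie;
}

/* Start the shared counter only if idle; always record the current value. */
static void poll_gp_seq_start(unsigned long *snap)
{
	if (!(gp_seq_polled & SEQ_STATE_MASK))
		seq_start(&gp_seq_polled);
	*snap = gp_seq_polled;
}

/* End the shared counter only if the recorded start is still current. */
static void poll_gp_seq_end(unsigned long *snap)
{
	if (*snap && *snap == gp_seq_polled)
		seq_end(&gp_seq_polled);
	*snap = 0;
}

struct result {
	unsigned long cookie;    /* value saved by the normal synchronize_rcu() */
	unsigned long final_seq; /* counter after both GPs have ended */
	bool poll_ok;            /* what the poll on the cookie reports */
};

static struct result run_scenario(void)
{
	unsigned long exp_snap = 0, normal_snap = 0; /* hypothetical names */
	struct result r;

	gp_seq_polled = 4;                   /* assumed idle value */

	poll_gp_seq_start(&exp_snap);        /* expedited GP starts: 4 -> 5 */
	assert(gp_seq_polled == 5);

	r.cookie = seq_snap(&gp_seq_polled); /* normal caller saves its cookie */

	poll_gp_seq_start(&normal_snap);     /* already started: only notes 5 */
	assert(normal_snap == 5);

	poll_gp_seq_end(&exp_snap);          /* matches 5, so 5 -> 8 */
	poll_gp_seq_end(&normal_snap);       /* 5 != 8, counter unchanged */

	r.final_seq = gp_seq_polled;
	r.poll_ok = seq_done(&gp_seq_polled, r.cookie);
	return r;
}
```

Calling run_scenario() in this model gives a cookie of 12, a final counter of 8, and a poll result of false, which matches the false positive described above. Swapping the order of the two poll_gp_seq_end() calls gives the same result in this model, since only the first call still matches the counter. With separate counters for normal and expedited GPs, as the _full() variants are described to use, the normal cookie would not be affected by the expedited start.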