Re: [PATCH RFC] srcu: Yet more detail for srcu_readers_active_idx_check() comments

Joel Fernandes <joel@xxxxxxxxxxxxxxxxx> · Fri, 16 Dec 2022 11:54:19 -0500

> On Dec 16, 2022, at 11:51 AM, Paul E. McKenney <paulmck@xxxxxxxxxx> wrote:
> 
> On Fri, Dec 16, 2022 at 04:32:39PM +0000, Joel Fernandes wrote:
>> On Thu, Dec 15, 2022 at 05:09:14PM -0800, Paul E. McKenney wrote:
>> [...]
>>>>>> 2. unlock()'s smp_mb() happened before Flip+smp_mb() , now the reader
>>>>>> has no new smp_mb() that happens AFTER the flip happened. So it can
>>>>>> totally sample the old idx again -- that particular reader will
>>>>>> increment twice, but the next time, it will see the flipped one.
>>>>> 
>>>>> I will let you transliterate both.  ;-)
>>>> 
>>>> I think I see what you mean now :)
>>>> 
>>>> I believe the access I am referring to is the read of idx on one side and
>>>> the write to idx on the other. However that is incomplete and I need to
>>>> pair that with some of other access on both sides.
>>>> 
>>>> So perhaps this:
>>>> 
>>>> Writer does flip + smp_mb + read unlock counts [1]
>>>> 
>>>> Reader does:
>>>> read idx + smp_mb() + increment lock counts [2]
>>>> 
>>>> And subsequently reader does
>>>> Smp_mb() + increment unlock count. [3]
>>>> 
>>>> So [1] races with either [2] or [2]+[3].
>>>> 
>>>> Is that fair?
>>> 
>>> That does look much better, thank you!
>> 
>> Perhaps a comment with an ASCII diagram will help?
>> 
>> 
>> Case 2:
>> Both the reader and the updater see each other's writes too late, but because
>> of memory barriers on both sides, they will eventually see each other's write
>> with respect to their own. This is similar to the store-buffer problem. This
>> let's a single reader contribute a maximum (unlock minus lock) imbalance of 2.
>> 
>> The following diagram shows the subtle worst case followed by a simplified
>> store-buffer explanation.
>> 
>> READER                  UPDATER
>> -------------           ----------
>>                           // idx is initially 0.
>> read_lock() {
>>  READ(idx) = 0;
>>  lock[0]++; --------------------------------------------,
>>                           flip() {                      |               
>>                              smp_mb();                  |
>>  smp_mb();                                              |
>> }                                                        |
>>                                                         |
>> // RSCS                                                  |
>>                                                         |
>> read_unlock() {                                          |
>>  smp_mb();                                              |
>>                              idx++;  // P               |
>>                              smp_mb();                  |
>>                           }                             |
>>                                                         |
>>                           scan_readers_idx(0) {         |
>>                               count all unlock[0];      |
>>                                   |                     |
>>                                   |                     |
>>  unlock[0]++; //X <--not-counted--`-----,               |
>>                                         |               |
>> }                                        V               `------,
>>                               // Will make sure next scan      |
>>                               // will not miss this unlock (X) |
>>                               // if other side saw flip (P) ,--`
>>                               // Call this MB [1]           |
>>                               // Order write(idx) with      |
>>                               // next scan's unlock.        |
>>                               smp_mb();                 ,---`
>> read_lock() {                                            |
>>  READ(idx)=0;                                           |
>>  lock[0]++; ----------------> count all lock[0];        |
>>  smp_mb();         |     }                              |
>> }     |             |                                    V
>>      |             `---> // Incorrect contribution to lock counting
>>      |                   // upto a maximum of 2 times.
>>      |
>>       `---> // Pairs with MB [1]. Makes sure that
>>             // the next read_lock()'s' idx read (Y) is ordered
>>             // with above write to unlock[0] (X).
>>                            |
>> rcu_read_unlock() {         |
>>  smp_mb(); <---------------`
>>  unlock[0]++; 
>> }
>> 
>> read_lock() {
>>  READ(idx) = 1; //Y
>>  lock[1]++;
>>  ...
>> }
>>                           scan_readers_idx(0) {
>>                               count all unlock[0]; //Q
>>                               ...
>> 
>> 
>> thanks,
>> 
>> - Joel
>> 
>>                          }
>> 
>> This makes it similar to the store buffer pattern. Using X, Y, P and Q
>> annotated above, we get:
>> 
>> READER                    UPDATER
>> X (write)                 P (write)
>> 
>> smp_mb();                 smp_mb();
>> 
>> Y (read)                  Q (read)
> 
> Given that this diagram is more than 50 lines long, it might go better in
> a design document describing this part of RCU.  Perhaps less detail or
> segmented, but the same general idea as this guy:
> 
> Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst

Yes, this sounds like a good place to add it and perhaps we refer to it from the C source file? I can take this up to do over the holidays, if you prefer.

Thanks,

  - Joel

> 
> Thoughts?
> 
>                        Thanx, Paul