Re: Section 9.5: Nobody expects the Spanish Acquisition!

Akira Yokosawa <akiyks@xxxxxxxxx> · Thu, 23 Dec 2021 21:59:45 +0900

On Thu, 23 Dec 2021 07:46:02 -0500, Elad Lahav wrote:
> Yes, Akira, that's where I was going. Nevertheless, my question is a
> bit more profound than that: is the asymmetry a result of the
> implementation of RCU, as presented here (and the way it is used in
> the Linux kernel), or is it inherent to RCU as a data access pattern?
> I believe it is the former, and have an idea on how to phrase this
> notion more formally. I will write something up and you can then
> decide whether:
> 
> 1. I'm wrong
> 2. I'm right, but it's not worth mentioning
> 3. I'm right, and the point should be added to the book
> 4. None of the above ;-)

Nice!
Looking forward to seeing your write-up.

        Thanks, Akira

> 
> --Elad
> 
> On Thu, 23 Dec 2021 at 07:22, Akira Yokosawa <akiyks@xxxxxxxxx> wrote:
>>
>> Hi,
>>
>> It sounds to me like Paul is missing the main point of Elad's question.
>> So I'm chiming in again.  See inline comment below Elad's question.
>>
>> On Wed, 22 Dec 2021 18:29:57 -0800, Paul E. McKenney wrote:
>>> Strictly speaking, all that the rcu_read_lock()/rcu_read_unlock() does
>>> is to ensure that any RCU-protected linked data objects referenced by
>>> the corresponding critical section remain in existence for the full
>>> duration of that critical section.
>>>
>>> And even that overstates things a bit, as this describes not the
>>> underlying RCU API itself, but rather some of that API's use cases.
>>> (Though to be fair, these are the most popular of RCU's use cases.)
>>> So even more strictly speaking, all the rcu_read_lock()/rcu_read_unlock()
>>> does is to make any synchronize_rcu() or call_rcu() invocations starting
>>> after a given rcu_read_lock() wait (synchronously or asynchronously,
>>> respectively) until the corresponding rcu_read_unlock() is reached.
>>>
>>> Returning back to the first paragraph, additional protections can be
>>> arranged, depending on the RCU use case.  For example, if the non-pointer
>>> fields of objects added to an RCU-protected linked data structure remain
>>> unchanged while that object is accessible to readers, then the protection
>>> includes not just existence, but also value.  And there are other use
>>> cases where those values might change while accessible to readers.
>>>
>>> So the most accurate answer to your question is "That is a design
>>> choice, based on the RCU use case in question."
>>>
>>> My guess is that you are thinking in terms of designs where the
>>> non-pointer fields of an object are constant while that object is
>>> accessible to readers.  Is my guess correct?
>>>
>>>                                                       Thanx, Paul
>>>
>>> On Wed, Dec 22, 2021 at 04:35:24PM -0500, Elad Lahav wrote:
>>>> I will try to come up with something, but first I wanted to get your
>>>> opinion as to whether this design pattern, where you need to
>>>> dereference a pointer to a data structure containing all of the data
>>>> considered to be protected by the critical section, is inherent to
>>>> RCU, or is it just an idiosyncrasy of the implementation used in the
>>>> Linux kernel?
>>>>
>>>> --Elad
>>
>> I guess Elad is asking more of the design choice of rcu_assign_pointer()
>> and rcu_dereference().
>>
>> A naive reader might easily miss the asymmetry of rcu_assign_pointer()
>> with store-release and rcu_defeference() *without* load-acquire.
>>
>> IIUC, this asymmetry comes from the use cases where RCU performs best,
>> that is read-mostly.  Therefore, rcu_dereference() need to be
>> light-weight as possible and doesn't imply load-acquire.
>> Thankfully quite a lot of use cases can be covered by the pattern
>> where address-dependency is sufficient for read-side.
>>
>> So one approach for Elad's concern would be to add a Quick Quiz on
>> this asymmetry, I suppose.
>>
>> Elad, am I guessing right?
>>
>> Sidenote: RCU and memory accesses are orthogonal.  You can use whatever
>> memory access primitives for your needed memory ordering. (Which would
>> be quite likely a tricky thing to do for a naive reader, though.)
>>
>>         Thanks, Akira
>>
>>>>
>>>> On Wed, 22 Dec 2021 at 13:19, Paul E. McKenney <paulmck@xxxxxxxxxx> wrote:
>>>>>
>>>>> On Wed, Dec 22, 2021 at 11:35:02PM +0900, Akira Yokosawa wrote:
>>>>>> Hi,
>>>>>>
>>>>>> On Wed, 22 Dec 2021 07:15:54 -0500, Elad Lahav wrote:
>>>>>>> Hi Akira,
>>>>>>>
>>>>>>>> I think this is mostly covered in Section 9.5.2.1 "Publish-Subscribe
>>>>>>>> Mechanism" and Figure 9.10 "Publication/Subscription Constraints".
>>>>>>>
>>>>>>> I don't believe it is. I think you can infer that from reading sections 9.5
>>>>>>> and chapter 15, but it is never made explicit.
>>>>>>
>>>>>> I agree it is not explicit.
>>>>>>
>>>>>>>
>>>>>>> Sub-section 9.5.2.1 talks about the use of rcu_dereference() to obtain a
>>>>>>> pointer, but it doesn't go very deep into how that's implemented. You would
>>>>>>> think that the writer's use of store-release semantics would necessitate
>>>>>>> the reader's use of load-acquire semantics, but the discussion here, the
>>>>>>> previous examples, and a quick inspection of how rcu_dereference() is
>>>>>>> implemented, suggest that a READ_ONCE() is sufficient (or some less
>>>>>>> Linux-kernel-specific equivalent).
>>>>>>>
>>>>>>> If I am a naive reader (and a lazy one, who haven't read chapter 15 and made
>>>>>>> the necessary connection), I could write something like:
>>>>>>>
>>>>>>>     struct foo_s {
>>>>>>>         int a;
>>>>>>>     } foo;
>>>>>>>     int b;
>>>>>>>     struct foo_s *g_foop = &foo;
>>>>>>>
>>>>>>>     // Reader
>>>>>>>     rcu_read_lock();
>>>>>>>     struct foo *l_foop = rcu_dereference(g_foop);
>>>>>>>     use(l_foop->a);
>>>>>>>     use(b);
>>>>>>>     rcu_read_unlock();
>>>>>>>
>>>>>>>     // Writer
>>>>>>>     struct foo *l_foop = malloc(sizeof(struct foo));
>>>>>>>     l_foop->a = 1;
>>>>>>>     b = 2;
>>>>>>>     // Release semantics ensure previous stores are observed
>>>>>>>     rcu_assign_pointer(g_foop, l_foop);
>>>>>>>
>>>>>>> But since b has no address dependency to g_foop and since neither
>>>>>>> rcu_read_lock() nor rcu_dereference() impose acquire semantics, then
>>>>>>> the reader may not observe the new value of b.
>>>>>>
>>>>>> No, it may not.
>>>>>> So what you want is the mention of "address dependency" somewhere in
>>>>>> Section 9.5.2.1?
>>>>>>
>>>>>>>
>>>>>>> Again, I may be missing something, but this seems to be a major point
>>>>>>> that needs to be explained and emphasized.
>>>>>>
>>>>>> I guess Paul is intentionally avoiding discussions on memory ordering
>>>>>> here in Section 9.5.
>>>>>>
>>>>>> Such a discussion would easily frighten naive readers...
>>>>>>
>>>>>> Anyway, I guess Paul would have some good compromise to satisfy you.
>>>>>
>>>>> Actually, I am going to ask Elad to propose the location and wording of an
>>>>> addition to Section 9.5 covering this, so that we can discuss and refine
>>>>> it.  This addition might be a new paragraph, a footnote, a quick quiz,
>>>>> a citation, or what have you.  And the refining might switch back and
>>>>> forth among these options a few times.  But we do have to start somewhere.
>>>>>
>>>>> I am thinking in terms of this going in after the release that I was
>>>>> naively planning to do yesterday.  (The objective universe had other
>>>>> ideas.)
>>>>>
>>>>>>>> BTW, I couldn't figure out what you meant by "the Spanish
>>>>>>>> Acquisition"...
>>>>>>> It's a reference to an old Monty Python skit. Apologies for the silly pun...
>>>>>>
>>>>>> Ah, now I guess I see the pun of "acquire" and "acquisition".  ;-)
>>>>>
>>>>> And here I was hoping that Elad was purchasing a house in Barcelona
>>>>> or some such.  ;-)
>>>>>
>>>>> (Sorry, couldn't resist!)
>>>>>
>>>>>                                                         Thanx, Paul