On Wed, Aug 31, 2022 at 01:38:35PM -0400, Alan Stern wrote: > On Wed, Aug 31, 2022 at 06:42:05PM +0200, Paul Heidekrüger wrote: > > On 31. Aug 2022, at 03:57, Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> wrote: > > > > > On Tue, Aug 30, 2022 at 05:12:33PM -0400, Joel Fernandes wrote: > > >> On 8/30/2022 5:08 PM, Joel Fernandes wrote: > > >>> On 8/30/2022 4:44 PM, Paul Heidekrüger wrote: > > >>>> The current informal control dependency definition in explanation.txt is > > >>>> too broad and, as dicsussed, needs to be updated. > > >>>> > > >>>> Consider the following example: > > >>>> > > >>>>> if(READ_ONCE(x)) > > >>>>> return 42; > > >>>>> > > >>>>> WRITE_ONCE(y, 42); > > >>>>> > > >>>>> return 21; > > >>>> > > >>>> The read event determines whether the write event will be executed "at > > >>>> all" - as per the current definition - but the formal LKMM does not > > >>>> recognize this as a control dependency. > > >>>> > > >>>> Introduce a new defintion which includes the requirement for the second > > >>>> memory access event to syntactically lie within the arm of a non-loop > > >>>> conditional. > > >>>> > > >>>> Link: https://lore.kernel.org/all/20220615114330.2573952-1-paul.heidekrueger@xxxxxxxxx/ > > >>>> Cc: Marco Elver <elver@xxxxxxxxxx> > > >>>> Cc: Charalampos Mainas <charalampos.mainas@xxxxxxxxx> > > >>>> Cc: Pramod Bhatotia <pramod.bhatotia@xxxxxxxxx> > > >>>> Cc: Soham Chakraborty <s.s.chakraborty@xxxxxxxxxx> > > >>>> Cc: Martin Fink <martin.fink@xxxxxxxxx> > > >>>> Signed-off-by: Paul Heidekrüger <paul.heidekrueger@xxxxxxxxx> > > >>>> Co-developed-by: Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> > > >>>> --- > > >>>> > > >>>> @Alan: > > >>>> > > >>>> Since I got it wrong the last time, I'm adding you as a co-developer after my > > >>>> SOB. I'm sorry if this creates extra work on your side due to you having to > > >>>> resubmit the patch now with your SOB if I understand correctly, but since it's > > >>>> based on your wording from the other thread, I definitely wanted to give you > > >>>> credit. > > >>>> > > >>>> tools/memory-model/Documentation/explanation.txt | 7 ++++--- > > >>>> 1 file changed, 4 insertions(+), 3 deletions(-) > > >>>> > > >>>> diff --git a/tools/memory-model/Documentation/explanation.txt b/tools/memory-model/Documentation/explanation.txt > > >>>> index ee819a402b69..0bca50cac5f4 100644 > > >>>> --- a/tools/memory-model/Documentation/explanation.txt > > >>>> +++ b/tools/memory-model/Documentation/explanation.txt > > >>>> @@ -464,9 +464,10 @@ to address dependencies, since the address of a location accessed > > >>>> through a pointer will depend on the value read earlier from that > > >>>> pointer. > > >>>> > > >>>> -Finally, a read event and another memory access event are linked by a > > >>>> -control dependency if the value obtained by the read affects whether > > >>>> -the second event is executed at all. Simple example: > > >>>> +Finally, a read event X and another memory access event Y are linked by > > >>>> +a control dependency if Y syntactically lies within an arm of an if, > > >>>> +else or switch statement and the condition guarding Y is either data or > > >>>> +address-dependent on X. Simple example: > > > > Thank you both for commenting! > > > > > "if, else or switch" should be just "if or switch". In C there is no > > > such thing as an "else" statement; an "else" clause is merely part of > > > an "if" statement. In fact, maybe "body" would be more appropriate than > > > "arm", because "switch" statements don't have arms -- they have cases. > > > > Right. What do you think of "branch"? "Body" to me suggests that there's > > only one and therefore that the else clause isn't included. > > > > Would it be fair to say that switch statements have branches? I guess > > because switch statements are a convenient way of writing goto's, i.e. > > jumps, it's a stretch and basically the same as saying "arm"? > > > > Maybe we can avoid the arm / case clash by just having a definition for if > > statements and appending something like "similarly for switch statements"? > > That sounds good. > > > >>> 'conditioning guarding Y' sounds confusing to me as it implies to me that the > > >>> condition's evaluation depends on Y. I much prefer Alan's wording from the > > >>> linked post saying something like 'the branch condition is data or address > > >>> dependent on X, and Y lies in one of the arms'. > > >>> > > >>> I have to ask though, why doesn't this imply that the second instruction never > > >>> executes at all? I believe that would break the MP-pattern if it were not true. > > >> > > >> About my last statement, I believe your patch does not disagree with the > > >> correctness of the earlier text but just wants to improve it. If that's case > > >> then that's fine. > > > > > > The biggest difference between the original text and Paul's suggested > > > update is that the new text makes clear that Y has to lie within the > > > body of the "if" or "switch" statement. If Y follows the end of the > > > if/else, as in the example at the top of this email, then it does have > > > not a control dependency on X (at least, not via that if/else), even > > > though the value read by X does determine whether or not Y will execute. > > > > > > [It has to be said that this illustrates a big weakness of the LKMM: It > > > isn't cognizant of "goto"s or "return"s. This naturally derives from > > > limitations of the herd tool, but the situation could be improved. So > > > for instance, I don't think it would cause trouble to say that in: > > > > > > if (READ_ONCE(x) == 0) > > > return; > > > WRITE_ONCE(y, 5); > > > > > > there really is a control dependence from x to y, even though the > > > WRITE_ONCE is outside the body of the "if" statement. Certainly the > > > compiler can't reorder the write before the read. But AFAIK there's no > > > way to include a "return" statement in a litmus test for herd. Or a > > > subroutine definition, for that matter.] > > > > > > I agree that "condition guarding Y" is somewhat awkward. "the > > > condition of the if (or the expression of the switch)" might be better, > > > even though it is somewhat awkward as well. At least it's more > > > explicit. > > > > Maybe we can reuse the wording from the data and address dependency > > definition here and say "affects"? > > > > Putting it all together: > > > > > Finally, a read event X and another memory access event Y are linked by a > > > control dependency if Y syntactically lies within a branch of an if or > > > switch statement and X affects the evaluation of that statement's > > > condition via a data or address dependency. > > > > Alternatively without the arm / case clash: > > > > > Finally, a read event X and another memory access event Y are linked by a > > > control dependency if Y syntactically lies within an arm of an if > > > statement and X affects the evaluation of the if condition via a data or > > > address dependency. Similarly for switch statements. > > > > What do you think? > > I like the second one. How about combining the last two sentences? > > ... via a data or address dependency (or similarly for a switch > statement). > > Now I suppose someone will pipe up and ask about the conditional > expressions in "for", "while" and "do" statements... :-) What? You don't like setjmp() and longjmp()? ;-) Thanx, Paul