On Fri, 28 Feb 2020 at 19:54, Marco Elver <elver@xxxxxxxxxx> wrote: > > On Fri, 28 Feb 2020 at 18:24, Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> wrote: > > > > On Fri, 28 Feb 2020, Marco Elver wrote: > > > > > For language-level memory consistency models that are adaptations of > > > data-race-free, the definition of "data race" can be summarized as > > > "concurrent conflicting accesses, where at least one is non-sync/plain". > > > > > > The definition of "conflict" should not include the type of access nor > > > whether the accesses are concurrent or not, which this patch addresses > > > for explanation.txt. > > > > Why shouldn't it? Can you provide any references to justify this > > assertion? > > The definition of "conflict" as we know it and is cited by various > papers on memory consistency models appeared in [1]: "Two accesses to > the same variable conflict if at least one is a write; two operations > conflict if they execute conflicting accesses." > > The LKMM as well as C11 are adaptations of data-race-free, which are > based on the work in [2]. Necessarily, we need both conflicting data > operations (plain) and synchronization operations (marked). C11's > definition is based on [3], which defines a "data race" as: "Two > memory operations conflict if they access the same memory location, > and at least one of them is a store, atomic store, or atomic > read-modify-write operation. In a sequentially consistent execution, > two memory operations from different threads form a type 1 data race > if they conflict, at least one of them is a data operation, and they > are adjacent in <T (i.e., they may be executed concurrently)." > > [1] D. Shasha, M. Snir, "Efficient and Correct Execution of Parallel > Programs that Share Memory", 1988. > URL: http://snir.cs.illinois.edu/listed/J21.pdf > > [2] S. Adve, "Designing Memory Consistency Models for Shared-Memory > Multiprocessors", 1993. > URL: http://sadve.cs.illinois.edu/Publications/thesis.pdf > > [3] H.-J. Boehm, S. Adve, "Foundations of the C++ Concurrency Memory > Model", 2008. > URL: https://www.hpl.hp.com/techreports/2008/HPL-2008-56.pdf > > > Also, note two things: (1) The existing text does not include > > concurrency in the definition of "conflict". (2) Your new text does > > include the type of access in the definition (you say that at least one > > of the accesses must be a write). > > Yes, "conflict" is defined in terms of "access to the same memory > location and at least one performs a write" (can be any operation that > performs a write, including RMWs etc.). It should not include > concurrency. We can have conflicting operations that are not > concurrent, but these will never be data races. > > > > The definition of "data race" remains unchanged, but the informal > > > definition for "conflict" is restored to what can be found in the > > > literature. > > > > It does not remain unchanged. You removed the portion that talks about > > accesses executing on different CPUs or threads. Without that > > restriction, you raise the nonsensical possibility that a single thread > > may by definition have a data race with itself (since modern CPUs use > > multiple-instruction dispatch, in which several instructions can > > execute at the same time). > > Andrea raised the point that "occur on different CPUs (or in different > threads on the same CPU)" can be interpreted as "in different threads > [even if they are serialized via some other synchronization]". > > Arguably, no sane memory model or abstract machine model permits > observable intra-thread concurrency of instructions in the same > thread. At the abstract machine level, whether or not there is true > parallelism shouldn't be something that the model concerns itself > with. Simply talking about "concurrency" is unambiguous, unless the > model says intra-thread concurrency is a thing. > > I can add it back if it helps make this clearer, but we need to mention both. > > > > Signed-by: Marco Elver <elver@xxxxxxxxxx> > > > --- > > > tools/memory-model/Documentation/explanation.txt | 15 ++++++--------- > > > 1 file changed, 6 insertions(+), 9 deletions(-) > > > > > > diff --git a/tools/memory-model/Documentation/explanation.txt b/tools/memory-model/Documentation/explanation.txt > > > index e91a2eb19592a..11cf89b5b85d9 100644 > > > --- a/tools/memory-model/Documentation/explanation.txt > > > +++ b/tools/memory-model/Documentation/explanation.txt > > > @@ -1986,18 +1986,15 @@ violates the compiler's assumptions, which would render the ultimate > > > outcome undefined. > > > > > > In technical terms, the compiler is allowed to assume that when the > > > -program executes, there will not be any data races. A "data race" > > > -occurs when two conflicting memory accesses execute concurrently; > > > -two memory accesses "conflict" if: > > > +program executes, there will not be any data races. A "data race" > > > > Unnecessary (and inconsistent with the rest of the document) whitespace > > change. > > Reverted. > > > > +occurs if: > > > > > > - they access the same location, > > > + two concurrent memory accesses "conflict"; > > > > > > - they occur on different CPUs (or in different threads on the > > > - same CPU), > > > + and at least one of the accesses is a plain access; > > > > > > - at least one of them is a plain access, > > > - > > > - and at least one of them is a store. > > > + where two memory accesses "conflict" if they access the same > > > + memory location, and at least one performs a write; > > > > > > The LKMM tries to determine whether a program contains two conflicting > > > accesses which may execute concurrently; if it does then the LKMM says > > > > To tell the truth, the only major change I can see here (apart from the > > "differenct CPUs" restriction) is that you want to remove the "at least > > one is plain" part from the definition of "conflict" and instead make > > it a separate requirement for a data race. That's fine with me in > > principle, but there ought to be an easier way of doing it. > > Yes pretty much. The model needs to be able to talk about "conflicting > synchronization accesses" where all accesses are marked. Right now the > definition of conflict doesn't permit that. > > > Furthermore, this section of explanation.txt goes on to use the words > > "conflict" and "conflicting" in a way that your patch doesn't address. > > For example, shortly after this spot it says "Determining whether two > > accesses conflict is easy"; you should change it to say "Determining > > whether two accesses conflict and at least one of them is plain is > > easy" -- but this looks pretty ungainly. A better approach might be to > > introduce a new term, define it to mean "conflicting accesses at least > > one of which is plain", and then use it instead throughout. > > The definition of "conflict" as used in the later text is synonymous > with "data race". Correction: it's "data race" minus "concurrent" which makes things more difficult. In which case, fixing this becomes more difficult. > > Alternatively, you could simply leave the text as it stands and just > > add a parenthetical disclaimer pointing out that in the CS literature, > > the term "conflict" is used even when both accesses are marked, so the > > usage here is somewhat non-standard. > > The definition of what a "conflict" is, is decades old [1, 2]. I > merely thought we should avoid changing fundamental definitions that > have not changed in decades, to avoid confusing people. The literature > on memory models is confusing enough, so fundamental definitions that > are "common ground" shouldn't be changed if it can be avoided. I think > here it is pretty trivial to avoid. > > Thanks, > -- Marco