On Wed, May 15, 2024 at 07:40:08PM +0200, Marco Elver wrote: > On Wed, 15 May 2024 at 17:57, Paul E. McKenney <paulmck@xxxxxxxxxx> wrote: > > On Wed, May 15, 2024 at 09:58:35AM +0200, Marco Elver wrote: > > > On Wed, 15 May 2024 at 01:47, Paul E. McKenney <paulmck@xxxxxxxxxx> wrote: > > > > On Mon, May 13, 2024 at 10:13:49AM +0200, Marco Elver wrote: > > > > > On Sat, 11 May 2024 at 02:41, Paul E. McKenney <paulmck@xxxxxxxxxx> wrote: > > > > > [...] > > > > > > ------------------------------------------------------------------------ > > > > > > > > > > > > commit 930cb5f711443d8044e88080ee21b0a5edda33bd > > > > > > Author: Paul E. McKenney <paulmck@xxxxxxxxxx> > > > > > > Date: Fri May 10 15:36:57 2024 -0700 > > > > > > > > > > > > kcsan: Add example to data_race() kerneldoc header > > > > > > > > > > > > Although the data_race() kerneldoc header accurately states what it does, > > > > > > some of the implications and usage patterns are non-obvious. Therefore, > > > > > > add a brief locking example and also state how to have KCSAN ignore > > > > > > accesses while also preventing the compiler from folding, spindling, > > > > > > or otherwise mutilating the access. > > > > > > > > > > > > [ paulmck: Apply Bart Van Assche feedback. ] > > > > > > > > > > > > Reported-by: Bart Van Assche <bvanassche@xxxxxxx> > > > > > > Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxx> > > > > > > Cc: Marco Elver <elver@xxxxxxxxxx> > > > > > > Cc: Breno Leitao <leitao@xxxxxxxxxx> > > > > > > Cc: Jens Axboe <axboe@xxxxxxxxx> > > > > > > > > > > > > diff --git a/include/linux/compiler.h b/include/linux/compiler.h > > > > > > index c00cc6c0878a1..9249768ec7a26 100644 > > > > > > --- a/include/linux/compiler.h > > > > > > +++ b/include/linux/compiler.h > > > > > > @@ -194,9 +194,17 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val, > > > > > > * This data_race() macro is useful for situations in which data races > > > > > > * should be forgiven. One example is diagnostic code that accesses > > > > > > * shared variables but is not a part of the core synchronization design. > > > > > > + * For example, if accesses to a given variable are protected by a lock, > > > > > > + * except for diagnostic code, then the accesses under the lock should > > > > > > + * be plain C-language accesses and those in the diagnostic code should > > > > > > + * use data_race(). This way, KCSAN will complain if buggy lockless > > > > > > + * accesses to that variable are introduced, even if the buggy accesses > > > > > > + * are protected by READ_ONCE() or WRITE_ONCE(). > > > > > > * > > > > > > - * This macro *does not* affect normal code generation, but is a hint > > > > > > - * to tooling that data races here are to be ignored. > > > > > > + * This macro *does not* affect normal code generation, but is a hint to > > > > > > + * tooling that data races here are to be ignored. If code generation must > > > > > > + * be protected *and* KCSAN should ignore the access, use both data_race() > > > > > > > > > > "code generation must be protected" seems ambiguous, because > > > > > protecting code generation could mean a lot of different things to > > > > > different people. > > > > > > > > > > The more precise thing would be to write that "If the access must be > > > > > atomic *and* KCSAN should ignore the access, use both ...". > > > > > > > > Good point, and I took your wording, thank you. > > > > > > > > > I've also had trouble in the past referring to "miscompilation" or > > > > > similar, because that also entirely depends on the promised vs. > > > > > expected semantics: if the code in question assumes for the access to > > > > > be atomic, the compiler compiling the code in a way that the access is > > > > > no longer atomic would be a "miscompilation". Although is it still a > > > > > "miscompilation" if the compiler generated code according to specified > > > > > language semantics (say according to our own LKMM) - and that's where > > > > > opinions can diverge because it depends on which side we stand > > > > > (compiler vs. our code). > > > > > > > > Agreed, use of words like "miscompilation" can annoy people. What > > > > word would you suggest using instead? > > > > > > Not sure. As suggested above, I try to just stick to "atomic" vs > > > "non-atomic" because that's ultimately the functional end result of > > > such a miscompilation. Although I've also had people be confused as in > > > "what atomic?! as in atomic RMW?!", but I don't know how to remove > > > that kind of confusion. > > > > > > If, however, our intended model is the LKMM and e.g. a compiler breaks > > > a dependency-chain, then we could talk about miscompilation, because > > > the compiler violates our desired language semantics. Of course the > > > compiler writers then will say that we try to do things that are > > > outside any implemented language semantics the compiler is aware of, > > > so it's not a miscompilation again. So it all depends on which side > > > we're arguing for. Fun times. > > > > ;-) ;-) ;-) > > > > > > > > + * and READ_ONCE(), for example, data_race(READ_ONCE(x)). > > > > > > > > > > Having more documentation sounds good to me, thanks for adding! > > > > > > > > > > This extra bit of documentation also exists in a longer form in > > > > > access-marking.txt, correct? I wonder how it would be possible to > > > > > refer to it, in case the reader wants to learn even more. > > > > > > > > Good point, especially given that I had forgotten about it. > > > > > > > > I don't have any immediate ideas for calling attention to this file, > > > > but would the following update be helpful? > > > > > > Mentioning __data_racy along with data_race() could be helpful, thank > > > you. See comments below. > > > > I did add a mention of it in "Linux-Kernel RCU Shared-Variable Marking" > > [1], but just a mention, given that I do not expect that we will use it > > within RCU. > > > > > Thanks, > > > -- Marco > > > > > > > Thanx, Paul > > > > > > > > ------------------------------------------------------------------------ > > > > > > > > diff --git a/tools/memory-model/Documentation/access-marking.txt b/tools/memory-model/Documentation/access-marking.txt > > > > index 65778222183e3..690dd59b7ac59 100644 > > > > --- a/tools/memory-model/Documentation/access-marking.txt > > > > +++ b/tools/memory-model/Documentation/access-marking.txt > > > > @@ -24,6 +24,12 @@ The Linux kernel provides the following access-marking options: > > > > 4. WRITE_ONCE(), for example, "WRITE_ONCE(a, b);" > > > > The various forms of atomic_set() also fit in here. > > > > > > > > +5. ASSERT_EXCLUSIVE_ACCESS() and ASSERT_EXCLUSIVE_WRITER(). > > > > > > Perhaps worth mentioning, but they aren't strictly access-marking > > > options. In the interest of simplicity could leave it out. > > > > Interestingly enough, they can be said to be implicitly marking other > > concurrent accesses to the variable. ;-) > > The document starts with "guidelines for marking intentionally > concurrent normal accesses to shared memory". The ASSERT_EXCLUSIVE > macros do capture more of the concurrency rules, and perhaps they > could be seen as some kind of "negative marking" where concurrent > access should _not_ happen concurrently with these. But I'm still not > convinced it's the same kind of marking the document introduces. > > I always considered them in the realm of general assertions that we > can just use to tell the tool more than can be inferred from the bits > of C code required for the functional implementation of whatever we're > doing. > > > I believe that the do need to be mentioned more prominently, though. > > > > Maybe a second list following this one? In that case, what do we name > > the list? I suppose the other alternative would be to leave them in > > this list, and change the preceding sentence to say something like > > "The Linux kernel provides the following access-marking-related primitives" > > > > Thoughts? > > And I just checked the current access-marking.txt to see where we > might add more, and found the section "ACCESS-DOCUMENTATION OPTIONS" > already exists. I think that section is perfectly reasonable as is, > and it does explicitly talk about ASSERT_EXCLUSIVE* macros. > > Did you want to add it more prominently at the top? If so, maybe a > brief forward-reference to that section might be helpful. How about like this? ------------------------------------------------------------------------ The Linux kernel provides the following access-marking options: 1. Plain C-language accesses (unmarked), for example, "a = b;" 2. Data-race marking, for example, "data_race(a = b);" 3. READ_ONCE(), for example, "a = READ_ONCE(b);" The various forms of atomic_read() also fit in here. 4. WRITE_ONCE(), for example, "WRITE_ONCE(a, b);" The various forms of atomic_set() also fit in here. 5. __data_racy, for example "int __data_racy a;" 6. KCSAN's negative-marking assertions, ASSERT_EXCLUSIVE_ACCESS() and ASSERT_EXCLUSIVE_WRITER(), are desccribed in the "ACCESS-DOCUMENTATION OPTIONS" section below. ------------------------------------------------------------------------ Would that work? Thanx, Paul