On Mon, May 13, 2024 at 10:13:49AM +0200, Marco Elver wrote: > On Sat, 11 May 2024 at 02:41, Paul E. McKenney <paulmck@xxxxxxxxxx> wrote: > [...] > > ------------------------------------------------------------------------ > > > > commit 930cb5f711443d8044e88080ee21b0a5edda33bd > > Author: Paul E. McKenney <paulmck@xxxxxxxxxx> > > Date: Fri May 10 15:36:57 2024 -0700 > > > > kcsan: Add example to data_race() kerneldoc header > > > > Although the data_race() kerneldoc header accurately states what it does, > > some of the implications and usage patterns are non-obvious. Therefore, > > add a brief locking example and also state how to have KCSAN ignore > > accesses while also preventing the compiler from folding, spindling, > > or otherwise mutilating the access. > > > > [ paulmck: Apply Bart Van Assche feedback. ] > > > > Reported-by: Bart Van Assche <bvanassche@xxxxxxx> > > Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxx> > > Cc: Marco Elver <elver@xxxxxxxxxx> > > Cc: Breno Leitao <leitao@xxxxxxxxxx> > > Cc: Jens Axboe <axboe@xxxxxxxxx> > > > > diff --git a/include/linux/compiler.h b/include/linux/compiler.h > > index c00cc6c0878a1..9249768ec7a26 100644 > > --- a/include/linux/compiler.h > > +++ b/include/linux/compiler.h > > @@ -194,9 +194,17 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val, > > * This data_race() macro is useful for situations in which data races > > * should be forgiven. One example is diagnostic code that accesses > > * shared variables but is not a part of the core synchronization design. > > + * For example, if accesses to a given variable are protected by a lock, > > + * except for diagnostic code, then the accesses under the lock should > > + * be plain C-language accesses and those in the diagnostic code should > > + * use data_race(). This way, KCSAN will complain if buggy lockless > > + * accesses to that variable are introduced, even if the buggy accesses > > + * are protected by READ_ONCE() or WRITE_ONCE(). > > * > > - * This macro *does not* affect normal code generation, but is a hint > > - * to tooling that data races here are to be ignored. > > + * This macro *does not* affect normal code generation, but is a hint to > > + * tooling that data races here are to be ignored. If code generation must > > + * be protected *and* KCSAN should ignore the access, use both data_race() > > "code generation must be protected" seems ambiguous, because > protecting code generation could mean a lot of different things to > different people. > > The more precise thing would be to write that "If the access must be > atomic *and* KCSAN should ignore the access, use both ...". Good point, and I took your wording, thank you. > I've also had trouble in the past referring to "miscompilation" or > similar, because that also entirely depends on the promised vs. > expected semantics: if the code in question assumes for the access to > be atomic, the compiler compiling the code in a way that the access is > no longer atomic would be a "miscompilation". Although is it still a > "miscompilation" if the compiler generated code according to specified > language semantics (say according to our own LKMM) - and that's where > opinions can diverge because it depends on which side we stand > (compiler vs. our code). Agreed, use of words like "miscompilation" can annoy people. What word would you suggest using instead? > > + * and READ_ONCE(), for example, data_race(READ_ONCE(x)). > > Having more documentation sounds good to me, thanks for adding! > > This extra bit of documentation also exists in a longer form in > access-marking.txt, correct? I wonder how it would be possible to > refer to it, in case the reader wants to learn even more. Good point, especially given that I had forgotten about it. I don't have any immediate ideas for calling attention to this file, but would the following update be helpful? Thanx, Paul ------------------------------------------------------------------------ diff --git a/tools/memory-model/Documentation/access-marking.txt b/tools/memory-model/Documentation/access-marking.txt index 65778222183e3..690dd59b7ac59 100644 --- a/tools/memory-model/Documentation/access-marking.txt +++ b/tools/memory-model/Documentation/access-marking.txt @@ -24,6 +24,12 @@ The Linux kernel provides the following access-marking options: 4. WRITE_ONCE(), for example, "WRITE_ONCE(a, b);" The various forms of atomic_set() also fit in here. +5. ASSERT_EXCLUSIVE_ACCESS() and ASSERT_EXCLUSIVE_WRITER(). + +6. The volatile keyword, for example, "int volatile a;" + +7. __data_racy, for example "int __data_racy a;" + These may be used in combination, as shown in this admittedly improbable example: @@ -205,6 +211,27 @@ because doing otherwise prevents KCSAN from detecting violations of your code's synchronization rules. +Use of volatile and __data_racy +------------------------------- + +Adding the volatile keyword to the declaration of a variable causes both +the compiler and KCSAN to treat all reads from that variable as if they +were protected by READ_ONCE() and all writes to that variable as if they +were protected by WRITE_ONCE(). + +Adding the __data_racy type qualifier to the declaration of a variable +causes KCSAN to treat all accesses to that variable as if they were +enclosed by data_race(). However, __data_racy does not affect the +compiler, though one could imagine hardened kernel builds treating the +__data_racy type qualifier as if it was the volatile keyword. + +Note well that __data_racy is subject to the same pointer-declaration +rules as is the volatile keyword. For example: + + int __data_racy *p; // Pointer to data-racy data. + int *__data_racy p; // Data-racy pointer to non-data-racy data. + + ACCESS-DOCUMENTATION OPTIONS ============================ @@ -342,7 +369,7 @@ as follows: Because foo is read locklessly, all accesses are marked. The purpose of the ASSERT_EXCLUSIVE_WRITER() is to allow KCSAN to check for a buggy -concurrent lockless write. +concurrent write, whether marked or not. Lock-Protected Writes With Heuristic Lockless Reads