On Mon, Nov 15, 2021 at 11:59AM -0800, Shakeel Butt wrote: > On Mon, Nov 15, 2021 at 10:55 AM Mina Almasry <almasrymina@xxxxxxxxxx> wrote: [...] > > Sorry I'm still a bit confused. READ_ONCE/WRITE_ONCE isn't documented > > to provide atomicity to the write or read, just prevents the compiler > > from re-ordering them. Is there something I'm missing, or is the > > suggestion to add READ_ONCE/WRITE_ONCE simply to supress the KCSAN > > warnings? It's actually the opposite: READ_ONCE/WRITE_ONCE provide very little ordering (modulo dependencies) guarantees, which includes ordering by compiler, but are supposed to provide atomicity (when used with properly aligned types up to word size [1]; see __READ_ONCE for non-atomic variant). Some more background... The warnings that KCSAN tells you about are "data races", which occur when you have conflicting concurrent accesses, one of which is "plain" and at least one write. I think [2] provides a reasonable summary of data races and why we should care. For Linux, our own memory model (LKMM) documents this [3], and says that as long as concurrent operations are marked (non-plain; e.g. *ONCE), there won't be any data races. There are multiple reasons why data races are undesirable, one of which is to avoid bad compiler transformations [4], because compilers are oblivious to concurrency otherwise. Why do marked operations avoid data races and prevent miscompiles? Among other things, because they should be executed atomically. If they weren't a lot of code would be buggy (there had been cases where the old READ_ONCE could be used on data larger than word size, which certainly weren't atomic, but this is no longer possible). [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/asm-generic/rwonce.h#n35 [2] https://lwn.net/Articles/816850/#Why%20should%20we%20care%20about%20data%20races? [3] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/memory-model/Documentation/explanation.txt#n1920 [4] https://lwn.net/Articles/793253/ Some rules of thumb when to use which marking: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/memory-model/Documentation/access-marking.txt In an ideal world, we'd have all intentionally concurrent accesses marked. As-is, KCSAN will find: A. Data race, where failure due to current compilers is unlikely (supposedly "benign"); merely marking the accesses appropriately is sufficient. Finding a crash for these will require a miscompilation, but otherwise look "benign" at the C-language level. B. Race-condition bugs where the bug manifests as a data race, too -- simply marking things doesn't fix the problem. These are the types of bugs where a data race would point out a more severe issue. Right now we have way too much of type (A), which means looking for (B) requires patience. > +Paul & Marco > > Let's ask the experts. > > We have a "unsigned long usage" variable that is updated within a lock > (hugetlb_lock) but is read without the lock. > > Q1) I think KCSAN will complain about it and READ_ONCE() in the > unlocked read path should be good enough to silent KCSAN. So, the > question is should we still use WRITE_ONCE() as well for usage within > hugetlb_lock? KCSAN's default config will forgive the lack of WRITE_ONCE(). Technically it's still a data race (which KCSAN can find with a config change), but can be forgiven because compilers are less likely to cause trouble for writes (background: https://lwn.net/Articles/816854/ bit about "Unmarked writes (aligned and up to word size)..."). I would mark both if feasible, as it clearly documents the fact the write can be read concurrently. > Q2) Second question is more about 64 bit archs breaking a 64 bit write > into two 32 bit writes. Is this a real issue? If yes, then the > combination of READ_ONCE()/WRITE_ONCE() are good enough for the given > use-case? Per above, probably unlikely, but allowed. WRITE_ONCE should prevent it, and at least relieve you to not worry about it (and shift the burden to WRITE_ONCE's implementation). Thanks, -- Marco