On Wed, Oct 11, 2023 at 11:26:29AM -0700, Yury Norov wrote: > Long story short: KCSAN found some potential issues related to how > people use bitmap API. And instead of working through that issues, > the following code shuts down KCSAN by applying READ_ONCE() here > and there. Pfft. > READ_ONCE() fixes nothing because nothing is broken in find_bit() API. > As I suspected, and as Matthew confirmed in his email, the true reason > for READ_ONCE() here is to disable KCSAN check: > > READ_ONCE() serves two functions here; > one is that it tells the compiler not to try anything fancy, and > the other is that it tells KCSAN to not bother instrumenting this > load; no load-delay-reload. > > https://lkml.kernel.org/linux-mm/ZQkhgVb8nWGxpSPk@xxxxxxxxxxxxxxxxxxxx/ > > And as side-effect, it of course hurts the performance. In the same > email Matthew said he doesn't believe me that READ_ONCE would do that, > so thank you for testing and confirming that it does. You really misinterpreted what Jan wrote to accomplish this motivated reasoning. > Jan, I think that in your last email you confirmed that the xarray > problem that you're trying to solve is about a lack of proper locking: > > Well, for xarray the write side is synchronized with a spinlock but the read > side is not (only RCU protected). > > https://lkml.kernel.org/linux-mm/20230918155403.ylhfdbscgw6yek6p@quack3/ > > If there's no enough synchronization, why not just adding it? You don't understand. We _intend_ for there to be no locking. We_understand_ there is a race here. We're _fine_ with there being a race here. > Regardless performance consideration, my main concern is that this patch > considers bitmap as an atomic structure, which is intentionally not. > There are just a few single-bit atomic operations like set_bit() and > clear_bit(). All other functions are non-atomic, including those > find_bit() operations. ... and for KCSAN to understand that, we have to use READ_ONCE. > There is quite a lot of examples of wrong use of bitmaps wrt > atomicity, the most typical is like: > for(idx = 0; idx < num; idx++) { > ... > set_bit(idx, bitmap); > } > > This is wrong because a series of atomic ops is not atomic itself, and > if you see something like this in you code, it should be converted to > using non-atomic __set_bit(), and protected externally if needed. That is a bad use of set_bit()! I agree! See, for example, commit b21866f514cb where I remove precisely this kind of code. > Similarly, READ_ONCE() in a for-loop doesn't guarantee any ordering or > atomicity, and only hurts the performance. And this is exactly what > this patch does. Go back and read Jan's patch again, instead of cherry-picking some little bits that confirm your prejudices.