On Fri 06-10-23 16:39:54, Mirsad Todorovac wrote: > On 9/19/2023 6:20 AM, Matthew Wilcox wrote: > > On Mon, Sep 18, 2023 at 11:56:36AM -0700, Yury Norov wrote: > > > Guys, I lost the track of the conversation. In the other email Mirsad > > > said: > > > Which was the basic reason in the first place for all this, because something changed > > > data from underneath our fingers .. > > > > > > It sounds clearly to me that this is a bug in xarray, *revealed* by > > > find_next_bit() function. But later in discussion you're trying to 'fix' > > > find_*_bit(), like if find_bit() corrupted the bitmap, but it's not. > > > > No, you're really confused. That happens. > > > > KCSAN is looking for concurrency bugs. That is, does another thread > > mutate the data "while" we're reading it. It does that by reading > > the data, delaying for a few instructions and reading it again. If it > > changed, clearly there's a race. That does not mean there's a bug! > > > > Some races are innocuous. Many races are innocuous! The problem is > > that compilers sometimes get overly clever and don't do the obvious > > thing you ask them to do. READ_ONCE() serves two functions here; > > one is that it tells the compiler not to try anything fancy, and > > the other is that it tells KCSAN to not bother instrumenting this > > load; no load-delay-reload. > > > > > In previous email Jan said: > > > for any sane compiler the generated assembly with & without READ_ONCE() > > > will be exactly the same. > > > > > > If the code generated with and without READ_ONCE() is the same, the > > > behavior would be the same, right? If you see the difference, the code > > > should differ. > > > > Hopefully now you understand why this argument is wrong ... > > > > > You say that READ_ONCE() in find_bit() 'fixes' 200 KCSAN BUG warnings. To > > > me it sounds like hiding the problems instead of fixing. If there's a race > > > between writing and reading bitmaps, it should be fixed properly by > > > adding an appropriate serialization mechanism. Shutting Kcsan up with > > > READ_ONCE() here and there is exactly the opposite path to the right direction. > > > > Counterpoint: generally bitmaps are modified with set_bit() which > > actually is atomic. We define so many bitmap things as being atomic > > already, it doesn't feel like making find_bit() "must be protected" > > as a useful use of time. > > > > But hey, maybe I'm wrong. Mirsad, can you send Yury the bug reports > > for find_bit and friends, and Yury can take the time to dig through them > > and see if there are any real races in that mess? > > > > > Every READ_ONCE must be paired with WRITE_ONCE, just like atomic > > > reads/writes or spin locks/unlocks. Having that in mind, adding > > > READ_ONCE() in find_bit() requires adding it to every bitmap function > > > out there. And this is, as I said before, would be an overhead for > > > most users. > > > > I don't believe you. Telling the compiler to stop trying to be clever > > rarely results in a performance loss. > > Hi Mr. Wilcox, > > Do you think we should submit a formal patch for this data-race? So I did some benchmarking with various GCC versions and the truth is that READ_ONCE() does affect code generation a bit (although the original code does not refetch the value from memory). As a result my benchmarks show the bit searching functions are about 2% slower. This is not much but it is stupid to cause a performance regression due to non-issue. I'm trying to get some compiler guys look into this whether we can improve it somehow... Honza -- Jan Kara <jack@xxxxxxxx> SUSE Labs, CR