> On Apr 14, 2021, at 7:06 AM, Vlastimil Babka <vbabka@xxxxxxx> wrote: > > On 4/14/21 9:32 AM, Christoph Lameter wrote: >> On Tue, 13 Apr 2021, Gong, Sishuai wrote: >> >>> We found a racy reading spot on shared variable n->free_objects in > > I'm assuming this was found with some research tool you're developing? > Did it also flag the line "shared = READ_ONCE(n->shared);" as that's basically > the same thing. Yes, the read of n->shared is also racy but we didn’t observe any data races on it. We saw cache_alloc_refill() could run in parallel with multi writers (free_block(), cache_grow_end() and even itself cache_alloc_refill()), but none of these writers would change n->shared. > >>> slab.c and it can be data-racing with several writers that update this >>> variable. As shown below, in function cache_alloc_refill(), >>> n->free_objects will be read without any protection. It could be >>> possible that the read value immediately becomes out-of-date when >>> another writer is changing it (e.g. free_block()) >> >> Ok that is fine. If we mistakenly fill up the per cpu cache with new >> objects to the slab then so be it. > >> If we mistakenly take the lock and fail to get an object then we can still >> reverse that decision and do the other thing. > > Agreed. It's common in the kernel to do optimistic reads outside of lock to > decide what to do and avoid locking at all in some cases. This may sacrifice > some precision of these decisions. but not correctness, as locks are taken later > for the critical parts. > >> Maybe we need to add a comment there? > > Or maybe some construct that makes no difference for the compiler, but does for > the tool? READ_ONCE() maybe?