On Thu 06-02-20 09:49:04, Jason Gunthorpe wrote: > On Wed, Feb 05, 2020 at 01:59:04PM -0800, Matthew Wilcox wrote: > > > > How is RCU mark reading used anyhow? > > > > We iterate over pages in the page cache with, eg, the dirty bit set. > > This bug will lead to the loop terminating early and failing to find > > dirty pages that it should. > > But the inhernetly weak sync of marks and pointers means that > iteration will miss some dirty pages and return some clean pages. It > is all OK for some reason? Yes. The reason is - the definitive source of dirtiness is in page flags so if iteration returns more pages, we don't care. WRT missing pages we only need to make sure pages that were dirty before the iteration started are returned and the current code fulfills that. > > > Actually the clearing of marks by xa_store(, NULL) is creating a very > > > subtle bug in drivers/infiniband/core/device.c :( Can you add a Fixes > > > line too: > > > > > > ib_set_client_data() is assuming the marks for the entry will not > > > change, but if the caller passed in NULL they get wrongly reset, and > > > three call sites pass in NULL: > > > drivers/infiniband/ulp/srpt/ib_srpt.c > > > net/rds/ib.c > > > net/smc/smc_ib.c > > > Fixes: 0df91bb67334 ("RDMA/devices: Use xarray to store the client_data") > > > > There's no bug here. > > > > If you're actually storing NULL in the array, then the marks would go > > away. That's inherent -- imagine you have an array with a single entry > > at 64. Then you store NULL there. That causes the node to be deleted, > > and the marks must necessarily disappear with it -- there's nowhere to > > store them! > > Ah, it is allocating! These little behavior differences are tricky to > remember over with infrequent use :( Yeah, that's why I'd prefer if NULL was not "special value" at all and if someone wanted to remove index from xarray he'd always have to use a special function. My patches go towards that direction but not the full way because there's still xa_cmpxchg() whose users use the fact that NULL is in fact 'erase'. Honza -- Jan Kara <jack@xxxxxxxx> SUSE Labs, CR