On Wed, Jan 17, 2024 at 11:05 PM Yosry Ahmed <yosryahmed@xxxxxxxxxx> wrote: > > The name changes from Chris to Christopher are confusing :D > > > > > I think it makes the review easier. The code adding and removing does > > not have much overlap. Combining it to a single patch does not save > > patch size. Having the assert check would be useful for some bisecting > > to narrow down which step causing the problem. I am fine with squash > > it to one patch as well. > > I think having two patches is unnecessarily noisy, and we add some > debug code in this patch that we remove in the next patch anyway. > Let's see what others think, but personally I prefer a single patch. > > > > > > > > > > > > I expect to merge the zswap rb tree spin lock with the xarray > > > > lock in the follow up changes. > > > > > > Shouldn't this simply be changing uses of tree->lock to use > > > xa_{lock/unlock}? We also need to make sure we don't try to lock the > > > tree when operating on the xarray if the caller is already holding the > > > lock, but this seems to be straightforward enough to be done as part > > > of this patch or this series at least. > > > > > > Am I missing something? > > > > Currently the zswap entry refcount is protected by the zswap tree spin > > lock as well. Can't remove the tree spin lock without changing the > > refcount code. I think the zswap search entry should just return the > > entry with refcount atomic increase, inside the RCU read() or xarray > > lock. The previous zswap code does the find_and_get entry() which is > > closer to what I want. > > I think this can be done in an RCU read section surrounding xa_load() xa_load() already has RCU read lock inside. If you do that you might just as well use some XAS API to work with the lock directly. > and the refcount increment. Didn't look closely to check how much > complexity this adds to manage refcounts with RCU, but I think there > should be a lot of examples all around the kernel. The complexity is not adding the refcount inside xa_load(). It is on the zswap code that calls zswap_search() and zswap_{insert,erase}(). As far as I can tell, those codes need some tricky changes to go along with the refcount change. > > IIUC, there are no performance benefits from this conversion until we > remove the tree spinlock, right? The original intent is helping the long tail case. RB tree has worse long tails than xarray. I expect it will help the page fault long tail even without removing the tree spinlock. Chris