Hi Yosry and Chris,

On 2024/1/18 14:39, Yosry Ahmed wrote:
> On Wed, Jan 17, 2024 at 10:01 PM Yosry Ahmed <yosryahmed@xxxxxxxxxx> wrote:
>>
>> That's a long CC list for sure :)
>>
>> On Wed, Jan 17, 2024 at 7:06 PM Chris Li <chrisl@xxxxxxxxxx> wrote:
>>>
>>> The RB tree shows some contribution to the swap fault
>>> long tail latency due to two factors:
>>> 1) RB tree requires re-balance from time to time.
>>> 2) The zswap RB tree has a tree level spin lock protecting
>>> the tree access.
>>>
>>> The swap cache is using xarray. The breakdown of the swap
>>> cache access does not show a similar long tail as the zswap
>>> RB tree.
>>
>> I think the comparison to the swap cache may not be valid as the swap
>> cache has many trees per swapfile, while zswap has a single tree.
>>
>>> Moving the zswap entry to xarray enables the read side to
>>> take only the RCU read lock.
>>
>> Nice.
>>
>>> The first patch adds the xarray alongside the RB tree.
>>> There are some debug checks asserting that the xarray agrees with
>>> the RB tree results.
>>>
>>> The second patch removes the zswap RB tree.
>>
>> The breakdown looks like something that would be a development step,
>> but for patch submission I think it makes more sense to have a single
>> patch replacing the rbtree with an xarray.
>>
>>> I expect to merge the zswap RB tree spin lock with the xarray
>>> lock in the follow-up changes.
>>
>> Shouldn't this simply be changing uses of tree->lock to use
>> xa_{lock/unlock}? We also need to make sure we don't try to lock the
>> tree when operating on the xarray if the caller is already holding the
>> lock, but this seems straightforward enough to be done as part
>> of this patch, or this series at least.
>>
>> Am I missing something?
>
> Also, I assume we will only see performance improvements after the
> tree lock in its current form is removed so that we get loads
> protected only by RCU.
> Can we get some performance numbers to see how
> the latency improves with the xarray under contention (unless
> Chengming is already planning on testing this for his multi-tree
> patches).

I just gave it a try: the same test of a kernel build in tmpfs with the
zswap shrinker enabled, all based on the latest mm/mm-stable branch.

                mm-stable      zswap-split-tree   zswap-xarray
real            1m10.442s      1m4.157s           1m9.962s
user            17m48.232s     17m41.477s         17m45.887s
sys             8m13.517s      5m2.226s           7m59.305s

Looks like the contention under concurrency is still there. I haven't
looked into the code yet; I will review it later.
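For readers following along, the read-side change being discussed looks
roughly like the sketch below. This is illustrative only, not an excerpt
from the patches; the names `zswap_tree`, `rbroot`, and `zswap_rb_search`
are assumed to match the current zswap code, and `xa_load()` /
`rcu_read_lock()` are the standard kernel APIs:

```c
/*
 * Illustrative sketch only -- not taken from the patches under review.
 * Compares the locking on the lookup path for the swap-offset ->
 * zswap_entry mapping before (RB tree) and after (xarray).
 */
#include <linux/xarray.h>
#include <linux/spinlock.h>
#include <linux/rcupdate.h>

/* Before: every reader serializes on the per-tree spinlock. */
static struct zswap_entry *lookup_rbtree(struct zswap_tree *tree,
					 pgoff_t offset)
{
	struct zswap_entry *entry;

	spin_lock(&tree->lock);
	entry = zswap_rb_search(&tree->rbroot, offset);
	spin_unlock(&tree->lock);
	return entry;
}

/* After: readers take only the RCU read lock; xa_load() walks the
 * xarray locklessly, so concurrent lookups no longer contend. */
static struct zswap_entry *lookup_xarray(struct xarray *xa, pgoff_t offset)
{
	struct zswap_entry *entry;

	rcu_read_lock();
	entry = xa_load(xa, offset);
	rcu_read_unlock();
	return entry;
}
```

Writers would still serialize on the xarray's internal `xa_lock` (via
`xa_store()`/`xa_erase()`), which is presumably why the benchmark above
still shows contention: the kernel-build workload is write-heavy on the
tree, so the read-side RCU win alone does not move the sys time much.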