Johannes Weiner <hannes@xxxxxxxxxxx> 于 2024年10月12日周六 02:28写道: > > On Fri, Oct 11, 2024 at 10:53:31AM -0700, Yosry Ahmed wrote: > > On Fri, Oct 11, 2024 at 10:20 AM Kairui Song <ryncsn@xxxxxxxxx> wrote: > > > > > > From: Kairui Song <kasong@xxxxxxxxxxx> > > > > > > zswap_invalidation simply calls xa_erase, which acquires the Xarray > > > lock first, then does a look up. This has a higher overhead even if > > > zswap is not used or the tree is empty. > > > > > > So instead, do a very lightweight xa_empty check first, if there is > > > nothing to erase, don't touch the lock or the tree. > > Great idea! > > > XA_STATE(xas, ..); > > > > rcu_read_lock(); > > entry = xas_load(&xas); > > if (entry) { > > xas_lock(&xas); > > WARN_ON_ONCE(xas_reload(&xas) != entry); > > xas_store(&xas, NULL); > > xas_unlock(&xas); > > } > > rcu_read_unlock(): > > This does the optimization more reliably, and I think we should go > with this version. Hi Yosry and Johannes, This is a good idea. But xa_empty is just much lighweighter, it's just a inlined ( == NULL ) check, so unsurprising it has better performance than xas_load. And surprisingly it's faster than zswap_never_enabled. So I think it could be doable to introduce something like zswap_may_have_swpentry as Yosry suggested. So how about a combined version with xas_load and xa_empty? Check xa_empty first as a faster path, then xas_load, then xas_store. Here is the benchmark result (time of swapin 2G zero pages in us): Before: 1908944 1905870 1905322 1905627 1901667 xa_empty: 1835343 1827367 1828402 1831841 1832719 z.._enabled: 1838428 1831162 1838205 1837287 1840980 xas_load: 1874606 1878971 1870182 1875852 1873403 combined: 1845309 1832919 1831904 1836455 1842570 `combined` is xa_empty + xas_load.