Kairui Song <ryncsn@xxxxxxxxx> writes: > On Wed, Mar 27, 2024 at 2:24 PM Huang, Ying <ying.huang@xxxxxxxxx> wrote: >> >> Kairui Song <ryncsn@xxxxxxxxx> writes: >> >> > From: Kairui Song <kasong@xxxxxxxxxxx> >> > >> > Interestingly the major performance overhead of synchronous is actually >> > from the workingset nodes update, that's because synchronous swap in >> >> If it's the major overhead, why not make it the first optimization? > > This performance issue became much more obvious after doing other > optimizations, and other optimizations are for general swapin not only > for synchronous swapin, that's also how I optimized things step by > step, so I kept my patch order... > > And it is easier to do this after Patch 8/10 which introduces the new > interface for swap cache. > >> >> > keeps adding single folios into a xa_node, making the node no longer >> > a shadow node and have to be removed from shadow_nodes, then remove >> > the folio very shortly and making the node a shadow node again, >> > so it has to add back to the shadow_nodes. >> >> The folio is removed only if should_try_to_free_swap() returns true? >> >> > Mark synchronous swapin folio with a special bit in swap entry embedded >> > in folio->swap, as we still have some usable bits there. Skip workingset >> > node update on insertion of such folio because it will be removed very >> > quickly, and will trigger the update ensuring the workingset info is >> > eventual consensus. >> >> Is this safe? Is it possible for the shadow node to be reclaimed after >> the folio are added into node and before being removed? > > If a xa node contains any non-shadow entry, it can't be reclaimed, > shadow_lru_isolate will check and skip such nodes in case of race. In shadow_lru_isolate(), /* * The nodes should only contain one or more shadow entries, * no pages, so we expect to be able to remove them all and * delete and free the empty node afterwards. */ if (WARN_ON_ONCE(!node->nr_values)) goto out_invalid; if (WARN_ON_ONCE(node->count != node->nr_values)) goto out_invalid; So, this isn't considered normal and will cause warning now. >> >> If so, we may consider some other methods. Make shadow_nodes per-cpu? > > That's also an alternative solution if there are other risks. This appears a general optimization and more clean. -- Best Regards, Huang, Ying