On Wed, Mar 27, 2024 at 2:49 PM Huang, Ying <ying.huang@xxxxxxxxx> wrote: > > Kairui Song <ryncsn@xxxxxxxxx> writes: > > > On Wed, Mar 27, 2024 at 2:24 PM Huang, Ying <ying.huang@xxxxxxxxx> wrote: > >> > >> Kairui Song <ryncsn@xxxxxxxxx> writes: > >> > >> > From: Kairui Song <kasong@xxxxxxxxxxx> > >> > > >> > Interestingly the major performance overhead of synchronous is actually > >> > from the workingset nodes update, that's because synchronous swap in > >> > >> If it's the major overhead, why not make it the first optimization? > > > > This performance issue became much more obvious after doing other > > optimizations, and other optimizations are for general swapin not only > > for synchronous swapin, that's also how I optimized things step by > > step, so I kept my patch order... > > > > And it is easier to do this after Patch 8/10 which introduces the new > > interface for swap cache. > > > >> > >> > keeps adding single folios into a xa_node, making the node no longer > >> > a shadow node and have to be removed from shadow_nodes, then remove > >> > the folio very shortly and making the node a shadow node again, > >> > so it has to add back to the shadow_nodes. > >> > >> The folio is removed only if should_try_to_free_swap() returns true? > >> > >> > Mark synchronous swapin folio with a special bit in swap entry embedded > >> > in folio->swap, as we still have some usable bits there. Skip workingset > >> > node update on insertion of such folio because it will be removed very > >> > quickly, and will trigger the update ensuring the workingset info is > >> > eventual consensus. > >> > >> Is this safe? Is it possible for the shadow node to be reclaimed after > >> the folio are added into node and before being removed? > > > > If a xa node contains any non-shadow entry, it can't be reclaimed, > > shadow_lru_isolate will check and skip such nodes in case of race. > > In shadow_lru_isolate(), > > /* > * The nodes should only contain one or more shadow entries, > * no pages, so we expect to be able to remove them all and > * delete and free the empty node afterwards. > */ > if (WARN_ON_ONCE(!node->nr_values)) > goto out_invalid; > if (WARN_ON_ONCE(node->count != node->nr_values)) > goto out_invalid; > > So, this isn't considered normal and will cause warning now. Yes, I added an exception in this patch: - if (WARN_ON_ONCE(node->count != node->nr_values)) + if (WARN_ON_ONCE(node->count != node->nr_values && mapping->host != NULL)) The code is not a good final solution, but the idea might not be that bad, list_lru provides many operations like LRU_ROTATE, we can even lazy remove all the nodes as a general optimization, or add a threshold for adding/removing a node from LRU. > > >> > >> If so, we may consider some other methods. Make shadow_nodes per-cpu? > > > > That's also an alternative solution if there are other risks. > > This appears a general optimization and more clean. I'm not sure if synchronization between CPUs will make more burden, because shadow nodes are globally shared, one node can be referenced by multiple CPUs, I can have a try to see if this is doable. Maybe a per-cpu batch is better but synchronization might still be an issue.