Chris Li <chrisl@xxxxxxxxxx> writes:

> On Sat, Apr 27, 2024 at 6:16 PM Huang, Ying <ying.huang@xxxxxxxxx> wrote:
>>
>> Chris Li <chrisl@xxxxxxxxxx> writes:
>>
>> > Hi Ying,
>> >
>> > For the swap file usage, I have been considering an idea to remove the
>> > index part of the xarray from swap cache. Swap cache is different from
>> > file cache in a few aspects.
>> > For one if we want to have a folio equivalent of "large swap entry".
>> > Then the natural alignment of those swap offset on does not make
>> > sense. Ideally we should be able to write the folio to un-aligned swap
>> > file locations.
>> >
>> > The other aspect for swap files is that, we already have different
>> > data structures organized around swap offset, swap_map and
>> > swap_cgroup. If we group the swap related data structure together. We
>> > can add a pointer to a union of folio or a shadow swap entry.
>>
>> The shadow swap entry may be freed.  So we need to prepare for that.
>
> Free the shadow swap entry will just set the pointer to NULL.
> Are you concerned that the memory allocated for the pointer is not
> free to the system after the shadow swap entry is free?
>
> It will be subject to fragmentation on the free swap entry.
> In that regard, xarray is also subject to fragmentation. It will not
> free the internal node if the node has one xa_index not freed. Even if
> the xarray node is freed to slab, at slab level there is fragmentation
> as well, the backing page might not free to the system.

Sorry my words were confusing.  What I wanted to say is that the xarray
node may be freed.

>> And, in current design, only swap_map[] is allocated if the swap space
>> isn't used.  That needs to be considered too.
>
> I am aware of that. I want to make the swap_map[] not static allocated
> any more either.

Yes.  That's possible.

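
To make the comparison a bit more concrete, here is a rough userspace C
sketch of the grouped, lazily allocated layout being discussed. It is
not kernel code and not taken from any posted patch: struct swap_slot,
struct swap_table, SLOTS_PER_CHUNK and every helper name are
hypothetical, and the low-bit tag only mimics the way xarray value
entries are encoded. The shadow value loses its top bit to the tag.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

struct folio;                    /* opaque stand-in, never dereferenced here */

#define SLOT_SHADOW_BIT 1UL      /* low bit set: slot holds a shadow value */
#define SLOTS_PER_CHUNK 512      /* hypothetical chunk size, cluster-like */

/* Hypothetical per-offset record grouping today's separate arrays. */
struct swap_slot {
    uint8_t   count;             /* role of swap_map[] */
    uint16_t  cgroup_id;         /* role of swap_cgroup */
    uintptr_t cache;             /* 0, a folio pointer, or a tagged shadow */
};

/* Chunks are allocated on first use, so unused swap space costs
 * only one NULL pointer per chunk. */
struct swap_table {
    size_t nr_chunks;
    struct swap_slot **chunks;
};

static inline void slot_set_folio(struct swap_slot *s, struct folio *f)
{
    /* Pointers must be at least 2-byte aligned to leave the tag bit free. */
    assert(((uintptr_t)f & SLOT_SHADOW_BIT) == 0);
    s->cache = (uintptr_t)f;
}

static inline void slot_set_shadow(struct swap_slot *s, unsigned long shadow)
{
    /* Same shape as an xarray value entry: (v << 1) | 1. */
    s->cache = ((uintptr_t)shadow << 1) | SLOT_SHADOW_BIT;
}

static inline int slot_is_shadow(const struct swap_slot *s)
{
    return (s->cache & SLOT_SHADOW_BIT) != 0;
}

static inline struct folio *slot_folio(const struct swap_slot *s)
{
    return slot_is_shadow(s) ? NULL : (struct folio *)s->cache;
}

static inline unsigned long slot_shadow(const struct swap_slot *s)
{
    return slot_is_shadow(s) ? (unsigned long)(s->cache >> 1) : 0;
}

static inline void slot_clear(struct swap_slot *s)
{
    s->cache = 0;                /* freeing a shadow just resets the field */
}

/* Return the slot for a swap offset, allocating its chunk on demand.
 * Returns NULL if the offset is out of range or allocation fails. */
static struct swap_slot *swap_table_slot(struct swap_table *t, size_t offset)
{
    size_t c = offset / SLOTS_PER_CHUNK;

    if (c >= t->nr_chunks)
        return NULL;
    if (!t->chunks[c]) {
        t->chunks[c] = calloc(SLOTS_PER_CHUNK, sizeof(struct swap_slot));
        if (!t->chunks[c])
            return NULL;
    }
    return &t->chunks[c][offset % SLOTS_PER_CHUNK];
}
```

Note that a chunk in this sketch can only be released once every slot
in it is clear, which is the same kind of fragmentation as an xarray
node that still has one index in use.
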
> The swap_map static allocation forces the rest of the swap data
> structure to have other means to sparsely allocate their data
> structure, repeating the fragmentation elsewhere, in different
> ways. That is also the one major source of the pain point hacking on
> the swap code. The data structure is spread into too many different
> places.

Look forward to more details to compare :-)

>> > We can use atomic updates on the swap struct member or breakdown the
>> > access lock by ranges just like swap cluster does.
>>
>> The swap code uses xarray in a simple way.  That gives us opportunity to
>> optimize.  For example, it makes it easy to use multiple xarray
>
> The fixed swap offset range makes it like an array. There are many
> ways to shard the array like swap entry, e.g. swap cluster is one way
> to shard it. Multiple xarray is another way. We can also do multiple
> xarray like sharding, or even more fancy ones.

--
Best Regards,
Huang, Ying