On Thu, May 30, 2024 at 10:54 AM Huang, Ying <ying.huang@xxxxxxxxx> wrote:
>
> Chris Li <chrisl@xxxxxxxxxx> writes:
>
> > Hi Ying,
> >
> > On Wed, May 29, 2024 at 1:57 AM Huang, Ying <ying.huang@xxxxxxxxx> wrote:
> >>
> >> Chris Li <chrisl@xxxxxxxxxx> writes:
> >>
> >> > I am spinning a new version for this series to address two issues
> >> > found in this series:
> >> >
> >> > 1) Oppo discovered a bug in the following line:
> >> > +        ci = si->cluster_info + tmp;
> >> > Should be "tmp / SWAPFILE_CLUSTER" instead of "tmp".
> >> > That is a serious bug but trivial to fix.
> >> >
> >> > 2) order 0 allocation currently blindly scans swap_map disregarding
> >> > the cluster->order.
> >>
> >> IIUC, we now scan swap_map[] only if
> >> list_empty(&si->free_clusters) && list_empty(&si->nonfull_clusters[order]).
> >> That is, if you don't run low on free swap space, you will not do that.
> >
> > You can still have swap space in order 0 clusters while order 4 runs
> > out of free_clusters or nonfull_clusters[order]. For Android that is
> > a common case.
>
> When we fail to allocate order 4, we will fall back to order 0.  We
> still don't need to scan swap_map[].  But after looking at your reply
> below, I realized that the swap space is almost full most of the time
> in your cases.  Then, it's possible that we run into scanning
> swap_map[].  list_empty(&si->free_clusters) &&
> list_empty(&si->nonfull_clusters[order]) will become true if we put too
> many clusters in si->percpu_cluster.  So, if we want to avoid scanning
> swap_map[], we can stop adding clusters to si->percpu_cluster when swap
> space runs low.  And maybe take clusters out of si->percpu_cluster
> sometimes.

Stopping the additions when space runs low seems too late; there could
still be a free cluster stuck on a CPU and not getting scanned, right?
A rough sketch of draining those cached clusters is at the end of this
mail.

> Another issue is that nonfull_clusters[order1] cannot be used for
> nonfull_clusters[order2].  By definition, we should not fail order 0
> allocation; we need to steal from nonfull_clusters[order > 0] for
> order 0 allocation.  This can avoid scanning swap_map[] too.  It may
> not be perfect, but it is the simplest first-step implementation.  You
> can optimize based on it further.

This can be extended to allow any order < MAX_ORDER to steal from a
higher order, though that might increase fragmentation.  A sketch of
the order 0 case is also below.

So this is looking more and more like a buddy allocator, and that
should be the long-term solution.
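
Here is the drain idea as a rough, untested sketch, not a patch.  It
assumes the list_head based clusters from this series; the
percpu_cluster layout (a cached "next" offset per order) and
SWAP_NEXT_INVALID follow my reading of the current code and may not
match exactly, and locking is omitted:

static void drain_percpu_clusters(struct swap_info_struct *si)
{
        int cpu, o;

        /* Caller would need to hold the appropriate si lock; not shown. */
        for_each_possible_cpu(cpu) {
                struct percpu_cluster *pc;

                pc = per_cpu_ptr(si->percpu_cluster, cpu);
                /*
                 * Invalidate only the cached offsets.  The clusters
                 * themselves stay on si->free_clusters or
                 * si->nonfull_clusters[], so the shared allocation
                 * paths can find them again.
                 */
                for (o = 0; o < SWAP_NR_ORDERS; o++)
                        pc->next[o] = SWAP_NEXT_INVALID;
        }
}

Called before falling back to the swap_map[] scan, something like this
would keep a free cluster from hiding behind an idle CPU's cache.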
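
And the order 0 stealing, again only a rough, untested sketch on top of
this series' data structures (SWAP_NR_ORDERS as the bound of
si->nonfull_clusters[], the cluster's list member, and the helper name
are all assumptions from my reading, not the real API):

static struct swap_cluster_info *steal_nonfull(struct swap_info_struct *si)
{
        int o;

        /*
         * Prefer the lowest non-empty higher order so the largest
         * clusters are broken up last; this loop is the knob that
         * trades order 0 allocation success against fragmentation.
         */
        for (o = 1; o < SWAP_NR_ORDERS; o++) {
                if (!list_empty(&si->nonfull_clusters[o]))
                        return list_first_entry(&si->nonfull_clusters[o],
                                                struct swap_cluster_info,
                                                list);
        }
        return NULL;
}

Order 0 could then allocate single slots from the returned cluster
instead of scanning swap_map[]; the cluster's order bookkeeping would
also need updating, which is omitted here.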