Re: [PATCH] mm: swap: mTHP frees entries as a whole


On 2024/8/6 10:07, Barry Song wrote:
On Tue, Aug 6, 2024 at 2:01 PM zhiguojiang <justinjiang@xxxxxxxx> wrote:


On 2024/8/6 6:09, Barry Song wrote:
On Tue, Aug 6, 2024 at 4:08 AM Zhiguo Jiang <justinjiang@xxxxxxxx> wrote:
Support freeing an mTHP's swap entries as a whole, which can avoid
frequent swap_info locking for every individual entry in
swapcache_free_entries(). When the swap_map count values corresponding
to all contiguous entries are zero excluding SWAP_HAS_CACHE, the
entries are freed directly, skipping the percpu swp_slots caches.

No, this isn't quite good. Please review the work done by Chris and Kairui[1];
they have handled it better. On a different note, I have a patch that can
handle zap_pte_range() for swap entries in batches[2][3].
I'm glad to see your optimized submission about batch freeing swap
entries in zap_pte_range(); sorry, I didn't see it before. This patch
of mine can be ignored.
No worries. Please help test and review the formal patch I sent:
https://lore.kernel.org/linux-mm/20240806012409.61962-1-21cnbao@xxxxxxxxx/
I believe it's OK and valuable. Looking forward to seeing it merged soon.

Please note that I didn't use a bitmap, in order to avoid a large
stack, and there is a real possibility that the condition below can
occur; your patch can crash if it is true:
nr > SWAPFILE_CLUSTER - offset % SWAPFILE_CLUSTER
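
For example (an illustrative sketch, not code from either patch; assuming
SWAPFILE_CLUSTER == 512, as on x86_64 with CONFIG_THP_SWAP), a 16-entry
range starting at offset 504 leaves only 512 - (504 % 512) = 8 entries in
the first cluster, so the loop walks into the next cluster while only the
first cluster's lock is held:

	/*
	 * Hypothetical helper: returns true when a contiguous range of
	 * nr entries starting at 'offset' crosses a cluster boundary,
	 * i.e. when a single lock_cluster_or_swap_info() call does not
	 * cover every entry the loop touches.
	 */
	static bool range_crosses_cluster(unsigned long offset, int nr)
	{
		return nr > SWAPFILE_CLUSTER - offset % SWAPFILE_CLUSTER;
	}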

Additionally, I quickly skip the case where
swap_count(data_race(si->swap_map[start_offset])) != 1 to avoid
regressions in cases that can't be batched.
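
Roughly, the bail-out looks like this (a sketch of the check just
described, not the exact code from [2]):

	/*
	 * If the first entry's swap count is anything other than 1, the
	 * range may not be batchable; bail out early and let the caller
	 * fall back to freeing the entries one by one.
	 */
	if (swap_count(data_race(si->swap_map[start_offset])) != 1)
		return false;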

Thanks
Zhiguo

[1] https://lore.kernel.org/linux-mm/20240730-swap-allocator-v5-5-cb9c148b9297@xxxxxxxxxx/
[2] https://lore.kernel.org/linux-mm/20240803091118.84274-1-21cnbao@xxxxxxxxx/
[3] https://lore.kernel.org/linux-mm/CAGsJ_4wPnQqKOHx6iQcwO8bQzoBXKr2qY2AgSxMwTQCj3-8YWw@xxxxxxxxxxxxxx/

Signed-off-by: Zhiguo Jiang <justinjiang@xxxxxxxx>
---
   mm/swapfile.c | 61 +++++++++++++++++++++++++++++++++++++++++++++++++++
   1 file changed, 61 insertions(+)

diff --git a/mm/swapfile.c b/mm/swapfile.c
index ea023fc25d08..829fb4cfb6ec
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1493,6 +1493,58 @@ static void swap_entry_range_free(struct swap_info_struct *p, swp_entry_t entry,
          swap_range_free(p, offset, nr_pages);
   }

+/*
+ * Free the contiguous swap entries as a whole, the caller has to
+ * ensure that all entries belong to the same folio.
+ */
+static void swap_entry_range_check_and_free(struct swap_info_struct *p,
+                                 swp_entry_t entry, int nr, bool *any_only_cache)
+{
+       const unsigned long start_offset = swp_offset(entry);
+       const unsigned long end_offset = start_offset + nr;
+       unsigned long offset;
+       DECLARE_BITMAP(to_free, SWAPFILE_CLUSTER) = { 0 };
+       struct swap_cluster_info *ci;
+       int i = 0, nr_setbits = 0;
+       unsigned char count;
+
+       /*
+        * Free and check swap_map count values corresponding to all contiguous
+        * entries in the whole folio range.
+        */
+       WARN_ON_ONCE(nr > SWAPFILE_CLUSTER);
+       ci = lock_cluster_or_swap_info(p, start_offset);
+       for (offset = start_offset; offset < end_offset; offset++, i++) {
+               if (data_race(p->swap_map[offset])) {
+                       count = __swap_entry_free_locked(p, offset, 1);
+                       if (!count) {
+                               bitmap_set(to_free, i, 1);
+                               nr_setbits++;
+                       } else if (count == SWAP_HAS_CACHE) {
+                               *any_only_cache = true;
+                       }
+               } else {
+                       WARN_ON_ONCE(1);
+               }
+       }
+       unlock_cluster_or_swap_info(p, ci);
+
+       /*
+        * If the swap_map count values corresponding to all contiguous entries are
+        * all zero excluding SWAP_HAS_CACHE, the entries will be freed directly by
+        * skipping percpu swp_slots caches, which can avoid frequent swap_info
+        * locking for every individual entry.
+        */
+       if (nr > 1 && nr_setbits == nr) {
+               spin_lock(&p->lock);
+               swap_entry_range_free(p, entry, nr);
+               spin_unlock(&p->lock);
+       } else {
+               for_each_set_bit(i, to_free, SWAPFILE_CLUSTER)
+                       free_swap_slot(swp_entry(p->type, start_offset + i));
+       }
+}
+
   static void cluster_swap_free_nr(struct swap_info_struct *sis,
                  unsigned long offset, int nr_pages,
                  unsigned char usage)
@@ -1808,6 +1860,14 @@ void free_swap_and_cache_nr(swp_entry_t entry, int nr)
          if (WARN_ON(end_offset > si->max))
                  goto out;

+       /*
+        * Try to free all contiguous entries of an mTHP as a whole.
+        */
+       if (IS_ENABLED(CONFIG_THP_SWAP) && nr > 1) {
+               swap_entry_range_check_and_free(si, entry, nr, &any_only_cache);
+               goto free_cache;
+       }
+
          /*
           * First free all entries in the range.
           */
@@ -1821,6 +1881,7 @@ void free_swap_and_cache_nr(swp_entry_t entry, int nr)
                  }
          }

+free_cache:
          /*
           * Short-circuit the below loop if none of the entries had their
           * reference drop to zero.
--
2.39.0

Thanks
Barry
Thanks
Zhiguo



