Re: [PATCH v4 1/2] mm: store zero pages to be swapped out in a bitmap

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 13/06/2024 17:38, Yosry Ahmed wrote:
[..]
+    for (i = 0; i < SWAPFILE_CLUSTER; i++)
+            clear_bit(idx * SWAPFILE_CLUSTER + i, si->zeromap);
Could you explain why we need to clear the zeromap here?

swap_cluster_schedule_discard() is called from:
- swap_free_cluster() -> free_cluster()

This is already covered below.

- swap_entry_free() -> dec_cluster_info_page() -> free_cluster()

Each entry in the cluster should have its zeromap bit cleared in
swap_entry_free() before the entire cluster is free and we call
free_cluster().

Am I missing something?
Yes, it looks like this one is not needed as swap_entry_free and
swap_free_cluster would already have cleared the bit. Will remove it.

I had initially started checking what codepaths zeromap would need to be
cleared. But then thought I could do it wherever si->swap_map is cleared
or set to SWAP_MAP_BAD, which is why I added it here.

      cluster_list_add_tail(&si->discard_clusters, si->cluster_info, idx);

@@ -482,7 +491,7 @@ static void __free_cluster(struct swap_info_struct *si, unsigned long idx)
   static void swap_do_scheduled_discard(struct swap_info_struct *si)
   {
      struct swap_cluster_info *info, *ci;
-    unsigned int idx;
+    unsigned int idx, i;

      info = si->cluster_info;

@@ -498,6 +507,8 @@ static void swap_do_scheduled_discard(struct swap_info_struct *si)
              __free_cluster(si, idx);
              memset(si->swap_map + idx * SWAPFILE_CLUSTER,
                              0, SWAPFILE_CLUSTER);
+            for (i = 0; i < SWAPFILE_CLUSTER; i++)
+                    clear_bit(idx * SWAPFILE_CLUSTER + i, si->zeromap);
Same here. I didn't look into the specific code paths, but shouldn't the
cluster be unused (and hence its zeromap bits already cleared?).

I think this one is needed (or atleast very good to have). There are 2
paths:

1) swap_cluster_schedule_discard (clears zeromap) -> swap_discard_work
-> swap_do_scheduled_discard (clears zeromap)

Path 1 doesnt need it as swap_cluster_schedule_discard already clears it.

2) scan_swap_map_slots -> scan_swap_map_try_ssd_cluster ->
swap_do_scheduled_discard (clears zeromap)

Path 2 might need it as zeromap isnt cleared earlier I believe
(eventhough I think it might already be 0).
Aren't the clusters in the discard list free by definition? It seems
like we add a cluster there from swap_cluster_schedule_discard(),
which we establish above that it gets called on a free cluster, right?

You mean for path 2? Its not from swap_cluster_schedule_discard. The whole call path is

get_swap_pages -> scan_swap_map_slots -> scan_swap_map_try_ssd_cluster -> swap_do_scheduled_discard. Nowhere up until swap_do_scheduled_discard was the zeromap cleared, which is why I think we should add it here.


Even if its cleared in path 2, I think its good to keep this one, as the
function is swap_do_scheduled_discard, i.e. incase it gets directly
called or si->discard_work gets scheduled anywhere else in the future,
it should do as the function name suggests, i.e. swap discard(clear
zeromap).
I think we just set the swap map to SWAP_MAP_BAD in
swap_cluster_schedule_discard() and then clear it in
swap_do_scheduled_discard(), and the clusters are already freed at
that point. Ying could set me straight if I am wrong here.
I think you might be mixing up path 1 and path 2 above? swap_cluster_schedule_discard is not called in Path 2 where swap_do_scheduled_discard ends up being called, which is why I think we would need to clear the zeromap here.
It is confusing to me to keep an unnecessary call tbh, it makes sense
to clear zeromap bits once, when the swap entry/cluster is not being
used anymore and before it's reallocated.

              unlock_cluster(ci);
      }
   }
@@ -1059,9 +1070,12 @@ static void swap_free_cluster(struct swap_info_struct *si, unsigned long idx)
   {
      unsigned long offset = idx * SWAPFILE_CLUSTER;
      struct swap_cluster_info *ci;
+    unsigned int i;

      ci = lock_cluster(si, offset);
      memset(si->swap_map + offset, 0, SWAPFILE_CLUSTER);
+    for (i = 0; i < SWAPFILE_CLUSTER; i++)
+            clear_bit(offset + i, si->zeromap);
      cluster_set_count_flag(ci, 0, 0);
      free_cluster(si, idx);
      unlock_cluster(ci);
@@ -1336,6 +1350,7 @@ static void swap_entry_free(struct swap_info_struct *p, swp_entry_t entry)
      count = p->swap_map[offset];
      VM_BUG_ON(count != SWAP_HAS_CACHE);
      p->swap_map[offset] = 0;
+    clear_bit(offset, p->zeromap);
I think instead of clearing the zeromap in swap_free_cluster() and here
separately, we can just do it in swap_range_free(). I suspect this may
be the only place we really need to clear the zero in the swapfile code.
Sure, we could move it to swap_range_free, but then also move the
clearing of swap_map.

When it comes to clearing zeromap, I think its just generally a good
idea to clear it wherever swap_map is cleared.
I am not convinced about this argument. The swap_map is used for
multiple reasons beyond just keeping track of whether a swap entry is
in-use. The zeromap on the other hand is simpler and just needs to be
cleared once when an entry is being freed.

Unless others disagree, I prefer to only clear the zeromap once in
swap_range_free() and keep the swap_map code as-is for now. If we
think there is value in moving clearing the swap_map to
swap_range_free(), it should at least be in a separate patch to be
evaluated separately.

Just my 2c.

Sure, I am indifferent to this. I dont think it makes a difference if the zeromap is cleared in swap_free_cluster + swap_entry_free or later on in a common swap_range_free function, so will just move it in the next revision. Wont move swap_map clearing code.





[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux