Re: [PATCH] mm/gup: don't check page lru flag before draining it

yangge1116 <yangge1116@xxxxxxx> · Thu, 6 Jun 2024 09:35:35 +0800

在 2024/6/5 下午5:53, David Hildenbrand 写道:
On 05.06.24 11:41, David Hildenbrand wrote:
On 05.06.24 03:18, yangge1116 wrote:


在 2024/6/4 下午9:47, David Hildenbrand 写道:
On 04.06.24 12:48, yangge1116@xxxxxxx wrote:
From: yangge <yangge1116@xxxxxxx>

If a page is added in pagevec, its ref count increases one, remove
the page from pagevec decreases one. Page migration requires the
page is not referenced by others except page mapping. Before
migrating a page, we should try to drain the page from pagevec in
case the page is in it, however, folio_test_lru() is not sufficient
to tell whether the page is in pagevec or not, if the page is in
pagevec, the migration will fail.

Remove the condition and drain lru once to ensure the page is not
referenced by pagevec.

What you are saying is that we might have a page on which
folio_test_lru() succeeds, that was added to one of the cpu_fbatches,
correct?

Yes


Can you describe under which circumstances that happens?


If we call folio_activate() to move a page from inactive LRU list to
active LRU list, the page is not only in LRU list, but also in one of
the cpu_fbatches.

void folio_activate(struct folio *folio)
{
       if (folio_test_lru(folio) && !folio_test_active(folio) &&
           !folio_test_unevictable(folio)) {
           struct folio_batch *fbatch;

           folio_get(folio);
           //After this, folio is in LRU list, and its ref count have
increased one.

           local_lock(&cpu_fbatches.lock);
           fbatch = this_cpu_ptr(&cpu_fbatches.activate);
           folio_batch_add_and_move(fbatch, folio, folio_activate_fn);
           local_unlock(&cpu_fbatches.lock);
       }
}

Interesting, the !SMP variant does the folio_test_clear_lru().

It would be really helpful if we could reliably identify whether LRU
batching code has a raised reference on a folio.

We have the same scenario in
* folio_deactivate()
* folio_mark_lazyfree()

In folio_batch_move_lru() we do the folio_test_clear_lru(folio).

No expert on that code, I'm wondering if we could move the
folio_test_clear_lru() out, such that we can more reliably identify
whether a folio is on the LRU batch or not.

I'm sure there would be something extremely broken with the following
(I don't know what I'm doing ;) ), but I wonder if there would be a way
to make something like that work (and perform well enough?).

diff --git a/mm/swap.c b/mm/swap.c
index 67786cb771305..642e471c3ec5a 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -212,10 +212,6 @@ static void folio_batch_move_lru(struct folio_batch 
*fbatch, move_fn_t move_fn)
         for (i = 0; i < folio_batch_count(fbatch); i++) {
                 struct folio *folio = fbatch->folios[i];

-               /* block memcg migration while the folio moves between 
lru */
-               if (move_fn != lru_add_fn && !folio_test_clear_lru(folio))
-                       continue;
-
                 folio_lruvec_relock_irqsave(folio, &lruvec, &flags);
                 move_fn(lruvec, folio);

@@ -255,8 +251,9 @@ static void lru_move_tail_fn(struct lruvec *lruvec, 
struct folio *folio)
   */
  void folio_rotate_reclaimable(struct folio *folio)
  {
-       if (!folio_test_locked(folio) && !folio_test_dirty(folio) &&
-           !folio_test_unevictable(folio) && folio_test_lru(folio)) {
+       if (folio_test_lru(folio) && !folio_test_locked(folio) &&
+           !folio_test_dirty(folio) && !folio_test_unevictable(folio) &&
+           folio_test_clear_lru(folio)) {
                 struct folio_batch *fbatch;
                 unsigned long flags;

@@ -354,7 +351,7 @@ static void folio_activate_drain(int cpu)
  void folio_activate(struct folio *folio)
  {
         if (folio_test_lru(folio) && !folio_test_active(folio) &&
-           !folio_test_unevictable(folio)) {
+           !folio_test_unevictable(folio) && 
folio_test_clear_lru(folio)) {
                 struct folio_batch *fbatch;

                 folio_get(folio);
@@ -699,6 +696,8 @@ void deactivate_file_folio(struct folio *folio)
         /* Deactivating an unevictable folio will not accelerate 
reclaim */
         if (folio_test_unevictable(folio))
                 return;
+       if (!folio_test_clear_lru(folio))
+               return;

         folio_get(folio);
         local_lock(&cpu_fbatches.lock);
@@ -718,7 +717,8 @@ void deactivate_file_folio(struct folio *folio)
  void folio_deactivate(struct folio *folio)
  {
         if (folio_test_lru(folio) && !folio_test_unevictable(folio) &&
-           (folio_test_active(folio) || lru_gen_enabled())) {
+           (folio_test_active(folio) || lru_gen_enabled()) &&
+           folio_test_clear_lru(folio)) {
                 struct folio_batch *fbatch;

                 folio_get(folio);
@@ -740,7 +740,8 @@ void folio_mark_lazyfree(struct folio *folio)
  {
         if (folio_test_lru(folio) && folio_test_anon(folio) &&
             folio_test_swapbacked(folio) && 
!folio_test_swapcache(folio) &&
-           !folio_test_unevictable(folio)) {
+           !folio_test_unevictable(folio) &&
+           folio_test_clear_lru(folio)) {
                 struct folio_batch *fbatch;

                 folio_get(folio);

With your changes, we will call folio_test_clear_lru(folio) firstly to 
clear the LRU flag, and then call folio_get(folio) to pin the folio, 
seems a little unreasonable. Normally, folio_get(folio) is called 
firstly to pin the page, and then some other functions is called to 
handle the folio.