On Wed, Apr 03, 2024 at 10:06:16PM +0000, yueyang.pan@xxxxxxx wrote:
> Dear Matthew,
> I am Yueyang Pan a PhD student from EPFL, and I am currently
> checking the swap code in the kernel. Sorry to bother you because this
> email should go to Mel Gorman. He did not reply to me so I turned to
> you for some help.

Hi Yueyang,

You'd probably have more luck if you cc'd the mailing list. Somebody
other than Mel might have answered you. Added it now.

> 1) I have some questions about `try_to_unmap_flush_dirty`. I
> wonder why this is necessary in shrink_folio_list because in
> the folio_check_references, we have already checked that the PTEs
> pointing to this page does not have any access bit set. The current
> shrink_folio_list then unmaps the page, clears the dirty bit, issues
> the TLB flush if the dirty bit was set previously and then starts
> to write the page to the swap. I wonder why here we cannot take an
> opportunistic approach. My understanding is that if we don’t unmap the
> page and perform flush, when there is a concurrent write to the page,
> both the access bit and dirty bit will be set (because the dirty bit
> is cleared) so we can simply check the access bit again after pageout
> to see whether we can free this page or not.
> I checked the git blame and saw Mel's commit in 2015 where he
> mentioned that it was better to assume a writeable entry exist
> in TLB but I wonder why this can be true if we have already use
> folio_check_references to check the PTE access bit. Does this imply
> even if the folio_check_references gives no reference there can be
> still entries in the TLB?

This is far outside my realm of expertise. I suspect it's possible
that there can be stale entries in the TLB if
ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH.

> 2) I have some questions at the end of shrink_folio_list before the
> ref_folios are spliced again back to folio_list. What if at the same
> time, there is another function trying to free the page or mlock
> the page? Will this page be circulated again in the inactive LRU
> list and being double freed since the page lock was released at the
> list_splice? Because from what I understood, the mlock will take the
> page directly from the inactive/active list the page is in and move
> the page to the mlock list but at this moment the page does not reside
> in either of the list.

I think the answer is that these folios have their LRU flag cleared
throughout shrink_folio_list() so they cannot be mlocked?

static struct lruvec *__mlock_folio(struct folio *folio, struct lruvec *lruvec)
{
	/* There is nothing more we can do while it's off LRU */
	if (!folio_test_clear_lru(folio))
		return lruvec;
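
To illustrate the idea behind that check, here is a minimal userspace
sketch, not kernel code. It assumes the LRU flag acts as an ownership
token: whoever atomically clears it gets to move the folio, and everyone
else backs off. All names (model_folio, model_test_clear_lru,
model_set_lru, model_mlock, model_demo) are hypothetical and exist only
for this example; it models the flag handoff and nothing else about what
the kernel does before or after.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a folio; only the LRU flag is modelled. */
struct model_folio {
	atomic_bool lru;	/* true while the folio is on an LRU list */
	bool mlocked;		/* set once the mlock path has handled it */
};

/* Atomically clear the flag and report whether it was set before. */
static bool model_test_clear_lru(struct model_folio *f)
{
	return atomic_exchange(&f->lru, false);
}

/* Give the flag back, as a putback to the LRU would. */
static void model_set_lru(struct model_folio *f)
{
	atomic_store(&f->lru, true);
}

/* Same shape as the quoted check: do nothing if the folio is off LRU. */
static bool model_mlock(struct model_folio *f)
{
	if (!model_test_clear_lru(f))
		return false;	/* someone else holds the folio */
	f->mlocked = true;
	model_set_lru(f);
	return true;
}

/* Isolate, attempt mlock (backs off), put back, attempt mlock again. */
static int model_demo(void)
{
	struct model_folio f = { .lru = true, .mlocked = false };

	bool isolated = model_test_clear_lru(&f);	/* reclaim takes it */
	bool raced = model_mlock(&f);			/* concurrent attempt */
	model_set_lru(&f);				/* reclaim puts it back */
	bool later = model_mlock(&f);			/* attempt after putback */

	return isolated && !raced && later && f.mlocked;
}
```

In this model, the mlock attempt made while the folio is isolated
returns without touching it, which is the behaviour the early return in
the quoted snippet suggests.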