On Wed, 4 Dec 2024 19:09:40 +0800 Qi Zheng <zhengqi.arch@xxxxxxxxxxxxx> wrote:

>
> ...
>
> Previously, we tried to use a completely asynchronous method to reclaim empty
> user PTE pages [1]. After discussing with David Hildenbrand, we decided to
> implement synchronous reclamation in the case of madvise(MADV_DONTNEED) as the
> first step.

Please help us understand what the other steps are, because we don't
want to commit to a particular partial implementation only to later
discover that completing that implementation causes us problems.

> So this series aims to synchronously free the empty PTE pages in
> madvise(MADV_DONTNEED) case. We will detect and free empty PTE pages in
> zap_pte_range(), and will add zap_details.reclaim_pt to exclude cases other than
> madvise(MADV_DONTNEED).
>
> In zap_pte_range(), mmu_gather is used to perform batch TLB flushing and page
> freeing operations. Therefore, if we want to free the empty PTE page in this
> path, the most natural way is to add it to mmu_gather as well. Now, if
> CONFIG_MMU_GATHER_RCU_TABLE_FREE is selected, mmu_gather will free page table
> pages by semi RCU:
>
>  - batch table freeing: asynchronous free by RCU
>  - single table freeing: IPI + synchronous free
>
> But this is not enough to free the empty PTE page table pages in paths other
> than the munmap and exit_mmap paths, because IPI cannot be synchronized with
> rcu_read_lock() in pte_offset_map{_lock}(). So we should let single table also
> be freed by RCU like batch table freeing.
>
> As a first step, we supported this feature on x86_64 and selected the newly
> introduced CONFIG_ARCH_SUPPORTS_PT_RECLAIM.
>
> For other cases such as madvise(MADV_FREE), consider scanning and freeing empty
> PTE pages asynchronously in the future.

Handling MADV_FREE sounds fairly straightforward?