On Sun, Jan 19, 2025 at 11:32:05AM +0100, Mateusz Guzik wrote:
> Dumping processes with large allocated and mostly not-faulted areas is
> very slow.
>
> Borrowing a test case from Tavian Barnes:
>
> int main(void) {
>         char *mem = mmap(NULL, 1ULL << 40, PROT_READ | PROT_WRITE,
>                 MAP_ANONYMOUS | MAP_NORESERVE | MAP_PRIVATE, -1, 0);
>         printf("%p %m\n", mem);
>         if (mem != MAP_FAILED) {
>                 mem[0] = 1;
>         }
>         abort();
> }
>
> That's 1TB of almost completely not-populated area.
>
> On my test box it takes 13-14 seconds to dump.
>
> The profile shows:
> -   99.89%     0.00%  a.out
>      entry_SYSCALL_64_after_hwframe
>      do_syscall_64
>      syscall_exit_to_user_mode
>      arch_do_signal_or_restart
>    - get_signal
>       - 99.89% do_coredump
>          - 99.88% elf_core_dump
>             - dump_user_range
>                - 98.12% get_dump_page
>                   - 64.19% __get_user_pages
>                      - 40.92% gup_vma_lookup
>                         - find_vma
>                            - mt_find
>                                 4.21% __rcu_read_lock
>                                 1.33% __rcu_read_unlock
>                      - 3.14% check_vma_flags
>                           0.68% vma_is_secretmem
>                        0.61% __cond_resched
>                        0.60% vma_pgtable_walk_end
>                        0.59% vma_pgtable_walk_begin
>                        0.58% no_page_table
>                   - 15.13% down_read_killable
>                        0.69% __cond_resched
>                     13.84% up_read
>                  0.58% __cond_resched
>
> Almost 29% of the time is spent relocking the mmap semaphore between
> calls to get_dump_page() which find nothing.
>
> Whacking that results in times of 10 seconds (down from 13-14).
>
> While here make the thing killable.
>
> The real problem is the page-sized iteration and the real fix would
> patch it up instead. It is left as an exercise for the mm-familiar
> reader.
>
> Signed-off-by: Mateusz Guzik <mjguzik@xxxxxxxxx>
> ---

Seems like a good improvement to me. Let's get it tested.