Hi all,

I'd like to efficiently mmap a large sparse file (ext4), 95% of which
is holes. I was unsatisfied with the performance, and after profiling
I found that most of the time is spent in filemap_add_folio and
filemap_alloc_folio - much more than in my algorithm:

-   97.87%  filemap_fault
   -   97.57%  do_sync_mmap_readahead
      -  page_cache_ra_order
         -   97.28%  page_cache_ra_unbounded
            -   40.80%  filemap_add_folio
               +   21.93%  __filemap_add_folio
               +    8.88%  folio_add_lru
               +    7.56%  workingset_refault
            +   28.73%  filemap_alloc_folio
            +   22.34%  read_pages
            +    3.29%  xa_load

As a workaround, I started using lseek with SEEK_HOLE+SEEK_DATA and
changed the algorithm to read from a static array filled with zeros
instead of reading from the holes (a simplified sketch is at the end
of this mail). This works ~30x faster; however, it introduces
substantial complexity in the implementation.

I was wondering whether mapping holes to zero pages with COW in the
kernel is being considered. I found [a related thread][1] from early
2022 which mentions mapping to zero pages for shared memory objects.
There seemed to be some concerns about the complexity; I wonder
whether it's different for (even just private/read-only) mmap.

[1]: https://lore.kernel.org/lkml/4b1885b8-eb95-c50-2965-11e7c8efbf36@xxxxxxxxxx/T/

Thanks,
David
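
For reference, here is a simplified sketch of the SEEK_HOLE/SEEK_DATA
walk I mean - not my actual code; the chunk size, the process()
stand-in and most of the error handling are just placeholders:

#define _GNU_SOURCE                     /* SEEK_DATA / SEEK_HOLE */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

#define CHUNK (1 << 20)                 /* arbitrary processing granularity */

static unsigned char zeros[CHUNK];      /* static zero-filled "hole" buffer */
static unsigned char buf[CHUNK];

/* stand-in for the real algorithm; here it just counts non-zero bytes */
static unsigned long nonzero;
static void process(const unsigned char *p, size_t len)
{
        for (size_t i = 0; i < len; i++)
                nonzero += p[i] != 0;
}

static int walk_sparse(int fd)
{
        off_t end = lseek(fd, 0, SEEK_END);
        off_t pos = 0;

        while (pos < end) {
                /* next data extent at or after pos (ENXIO: only holes left) */
                off_t data = lseek(fd, pos, SEEK_DATA);
                if (data < 0)
                        data = end;
                /* end of that data extent */
                off_t hole = data < end ? lseek(fd, data, SEEK_HOLE) : end;

                /* feed the algorithm zeros for the hole [pos, data), no I/O */
                for (off_t o = pos; o < data; o += CHUNK) {
                        size_t len = data - o < CHUNK ? data - o : CHUNK;
                        process(zeros, len);
                }

                /* read and process the data extent [data, hole) */
                for (off_t o = data; o < hole; ) {
                        size_t want = hole - o < CHUNK ? hole - o : CHUNK;
                        ssize_t n = pread(fd, buf, want, o);
                        if (n <= 0)
                                return -1;
                        process(buf, n);
                        o += n;
                }
                pos = hole;
        }
        return 0;
}

int main(int argc, char **argv)
{
        if (argc < 2) {
                fprintf(stderr, "usage: %s <sparse-file>\n", argv[0]);
                return 2;
        }
        int fd = open(argv[1], O_RDONLY);
        if (fd < 0 || walk_sparse(fd) < 0) {
                perror(argv[1]);
                return 1;
        }
        printf("non-zero bytes: %lu\n", nonzero);
        return 0;
}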