Thank you, Matthew, for your reply. What do you think about the
complexity of this task? I'd be interested in taking a look but I don't
have kernel development experience so I would need guidance.

On Thu, 20 Feb 2025 at 14:47, Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
>
> On Thu, Feb 20, 2025 at 01:48:18PM +0100, David Frank wrote:
> > I'd like to efficiently mmap a large sparse file (ext4), 95% of which
> > is holes. I was unsatisfied with the performance and after profiling,
> > I found that most of the time is spent in filemap_add_folio and
> > filemap_alloc_folio - much more than in my algorithm:
> >
> > - 97.87% filemap_fault
> >    - 97.57% do_sync_mmap_readahead
> >       - page_cache_ra_order
> >          - 97.28% page_cache_ra_unbounded
> >             - 40.80% filemap_add_folio
> >                + 21.93% __filemap_add_folio
> >                + 8.88% folio_add_lru
> >                + 7.56% workingset_refault
> >             + 28.73% filemap_alloc_folio
> >             + 22.34% read_pages
> >             + 3.29% xa_load
>
> Yes, this is expected.
>
> The fundamental problem is that we don't have the sparseness information
> at the right point. So the read request (or pagefault) comes in, the
> VFS allocates a page, puts it in the pagecache, then asks the filesystem
> to fill it. The filesystem knows, so could theoretically tell the VFS
> "Oh, this is a hole", but by this point the "damage" is done -- the page
> has been allocated and added to the page cache.
>
> Of course, this is a soluble problem. The VFS could ask the filesystem
> for its sparseness information (as you do in userspace), but unlike your
> particular usecase, the kernel must handle attackers who are trying to
> make it do the wrong thing as well as ill-timed writes. So the VFS has
> to ensure it does not use stale data from the filesystem.
>
> This is a problem I'm somewhat interested in solving, but I'm a bit
> busy with folios right now. And once that project is done, improving
> the page cache for reflinked files is next on my list, so I'm not likely
> to get to this problem for a few years.
>
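
For reference, the userspace approach mentioned in the quoted text (asking the filesystem for its sparseness information) is usually done with lseek(2) and SEEK_DATA/SEEK_HOLE. Below is a minimal sketch, assuming Linux and Python 3; data_extents is a hypothetical helper name, not something from this thread, and it is not necessarily the method used in the original program. The result is only a snapshot: a concurrent write can make it stale, which is the same problem the kernel would have to handle.

```python
import errno
import os


def data_extents(fd):
    """Return [(start, end), ...] byte ranges of fd that are not holes.

    Hypothetical helper for illustration. On filesystems that do not
    track holes, the whole file is reported as one data extent.
    """
    size = os.fstat(fd).st_size
    extents = []
    offset = 0
    while offset < size:
        try:
            # Next offset >= `offset` that contains data.
            start = os.lseek(fd, offset, os.SEEK_DATA)
        except OSError as e:
            if e.errno == errno.ENXIO:  # only a hole remains up to EOF
                break
            raise
        # End of this data run; EOF counts as an implicit hole.
        end = os.lseek(fd, start, os.SEEK_HOLE)
        extents.append((start, end))
        offset = end
    return extents
```

A caller could then touch (or mmap) only the returned ranges and treat everything else as zeros, avoiding page faults on holes altogether.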