Re: [Lsf-pc] [LSF/MM/BPF TOPIC] Optimizing Page Cache Readahead Behavior

On Tue, Feb 25, 2025 at 10:56:21AM +1100, Dave Chinner wrote:
> > From the previous discussions that Matthew shared [7], it seems like
> > Dave proposed an alternative to moving the extents to the VFS layer to
> > invert the IO read path operations [8]. Maybe this is a more
> > approachable solution since there is precedent for the same in the
> > write path?
> > 
> > [7] https://lore.kernel.org/linux-fsdevel/Zs97qHI-wA1a53Mm@xxxxxxxxxxxxxxxxxxxx/
> > [8] https://lore.kernel.org/linux-fsdevel/ZtAPsMcc3IC1VaAF@xxxxxxxxxxxxxxxxxxx/
> 
> Yes, if we are going to optimise away redundant zeros being stored
> in the page cache over holes, we need to know where the holes in the
> file are before the page cache is populated.

Well, you shot that down when I started trying to flesh it out:
https://lore.kernel.org/linux-fsdevel/Zs+2u3%2FUsoaUHuid@xxxxxxxxxxxxxxxxxxx/

> As for efficient hole tracking in the mapping tree, I suspect that
> we should be looking at using exceptional entries in the mapping
> tree for holes, not inserting multiple references to the zero folio.
> i.e. the important information for data storage optimisation is that
> the region covers a hole, not that it contains zeros.

The xarray is very much optimised for storing power-of-two sized &
aligned objects.  It makes no sense to try to track extents using the
mapping tree.  Now, if we abandon the radix tree for the maple tree, we
could talk about storing zero extents in the same data structure.
But that's a big change with potentially significant downsides.
It's something I want to play with, but I'm a little busy right now.
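
To make the power-of-two point concrete, here is a rough userspace sketch
(not kernel code, and not the xarray API). It assumes a structure that can
only hold entries whose size is a power of two and whose start index is a
multiple of that size, and counts how many such entries an arbitrary extent
breaks into. The names aligned_chunk() and count_aligned_chunks() are made
up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: largest power-of-two chunk that starts at 'index',
 * is naturally aligned there, and does not exceed 'remaining' pages.
 */
static uint64_t aligned_chunk(uint64_t index, uint64_t remaining)
{
	uint64_t size = 1;

	assert(remaining > 0);
	for (;;) {
		uint64_t next = size << 1;

		/* Stop on overflow, if it no longer fits, or if misaligned. */
		if (next == 0 || next > remaining || (index & (next - 1)) != 0)
			break;
		size = next;
	}
	return size;
}

/*
 * Hypothetical helper: number of power-of-two sized, naturally aligned
 * entries needed to cover the extent [start, start + len).
 */
static size_t count_aligned_chunks(uint64_t start, uint64_t len)
{
	size_t n = 0;

	while (len > 0) {
		uint64_t chunk = aligned_chunk(start, len);

		start += chunk;
		len -= chunk;
		n++;
	}
	return n;
}
```

Under these assumptions an aligned extent such as pages 0-7 is a single
entry, while the 10-page extent starting at page 3 splits into [3], [4-7],
[8-11] and [12]. An extent-based structure would describe either case with
one record, which is the attraction of something like the maple tree here.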

> For buffered reads, all that is required when such an exceptional
> entry is returned is a memset of the user buffer. For buffered
> writes, we simply treat it like a normal folio allocating write and
> replace the exceptional entry with the allocated (and zeroed) folio.

... and unmap the zero page from any mappings.

> For read page faults, the zero page gets mapped (and maybe
> accounted) via the vma rather than the mapping tree entry. For write
> faults, a folio gets allocated and the exception entry replaced
> before we call into ->page_mkwrite().
> 
> Invalidation simply removes the exceptional entries.

... and unmap the zero page from any mappings.
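
For readers following along, the buffered read/write/invalidate behaviour
Dave describes can be modelled in a few lines of userspace C. This is only
a toy under stated assumptions: the "mapping" is a fixed array, the hole
entry is a sentinel pointer, and every name (toy_mapping, TOY_HOLE,
toy_read() and so on) is invented for the example. It deliberately has no
notion of mmap, so it does not model the zero-page unmapping that the
replies above point out is also needed.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_PAGE_SIZE 16	/* tiny "page" size, for illustration only */
#define TOY_NR_SLOTS  4

/* Sentinel standing in for an exceptional "this range is a hole" entry. */
static char toy_hole_marker;
#define TOY_HOLE ((void *)&toy_hole_marker)

struct toy_mapping {
	void *slots[TOY_NR_SLOTS];	/* NULL, TOY_HOLE, or a page buffer */
};

/* Buffered read: a hole entry only needs a memset of the caller's buffer. */
static int toy_read(const struct toy_mapping *m, size_t idx, char *buf)
{
	void *entry;

	assert(idx < TOY_NR_SLOTS);
	entry = m->slots[idx];
	if (entry == NULL)
		return -1;	/* not cached; real I/O is not modelled */
	if (entry == TOY_HOLE)
		memset(buf, 0, TOY_PAGE_SIZE);
	else
		memcpy(buf, entry, TOY_PAGE_SIZE);
	return 0;
}

/* Buffered write: replace a hole entry with an allocated, zeroed page. */
static int toy_write(struct toy_mapping *m, size_t idx, size_t off,
		     const char *src, size_t len)
{
	void *entry;

	assert(idx < TOY_NR_SLOTS);
	assert(off <= TOY_PAGE_SIZE && len <= TOY_PAGE_SIZE - off);
	entry = m->slots[idx];
	if (entry == NULL)
		return -1;	/* not cached; real I/O is not modelled */
	if (entry == TOY_HOLE) {
		entry = calloc(1, TOY_PAGE_SIZE);
		if (entry == NULL)
			return -1;
		m->slots[idx] = entry;
	}
	memcpy((char *)entry + off, src, len);
	return 0;
}

/* Invalidation: drop the entry; a hole entry has no page to free. */
static void toy_invalidate(struct toy_mapping *m, size_t idx)
{
	assert(idx < TOY_NR_SLOTS);
	if (m->slots[idx] != NULL && m->slots[idx] != TOY_HOLE)
		free(m->slots[idx]);
	m->slots[idx] = NULL;
}
```

In this model a read of a hole slot never allocates anything, and the first
write to it is the only point where memory is consumed. The model says
nothing about where the hole information comes from, which is the part the
thread is actually arguing about.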




