On Fri 17-01-25 12:40:15, Vlastimil Babka wrote:
> On 1/15/25 01:50, Joanne Koong wrote:
> > Hi all,
> >
> > I would like to propose a discussion topic about improving large folio
> > writeback performance. As more filesystems adopt large folios, it
> > becomes increasingly important that writeback is made to be as
> > performant as possible. There are two areas I'd like to discuss:
> >
> >
> > == Granularity of dirty pages writeback ==
> > Currently, the granularity of writeback is at the folio level. If one
> > byte in a folio is dirty, the entire folio will be written back. This
> > becomes unscalable for larger folios and significantly degrades
> > performance, especially for workloads that employ random writes.
> >
> > One idea is to track dirty pages at a smaller granularity using a
> > 64-bit bitmap stored inside the folio struct, where each bit tracks a
> > smaller chunk of pages (e.g. for 2 MB folios, each bit would track
> > 32 KB worth of pages), and only write back dirty chunks rather than
> > the entire folio.
>
> I think this might be tricky in some cases? I.e. with a 2 MB, pmd-mapped
> folio, it's possible to write-protect only the whole pmd, not individual
> 32 KB chunks, in order to catch the first write to a chunk and mark it
> dirty.

Definitely. Once you map a folio through a PMD entry, you have no other
option than to consider the whole 2 MB dirty. But with PTE mappings or
modifications through syscalls you can do more fine-grained dirtiness
tracking, and there are enough cases like that that it pays off.

								Honza
--
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR
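
[Editorial note: the following is a minimal, standalone sketch of the per-chunk
dirty bitmap idea discussed above, not an existing kernel API. The struct and
helper names, and the assumption of a 2 MB folio split into 64 chunks of 32 KB
(one bitmap bit per chunk), are hypothetical and only illustrate why a small
random write would dirty a single chunk rather than the whole folio.]

/*
 * Hypothetical sketch: one 64-bit bitmap per folio, one bit per 32 KB
 * chunk of a 2 MB folio. Writeback walks only the set bits instead of
 * writing back the entire folio.
 */
#include <stdint.h>
#include <stdio.h>

#define FOLIO_SIZE	(2u << 20)		/* 2 MB folio */
#define CHUNK_SIZE	(FOLIO_SIZE / 64)	/* 32 KB tracked per bit */

struct folio_dirty_state {
	uint64_t dirty_bitmap;			/* bit n set => chunk n dirty */
};

/* Mark every chunk overlapping [offset, offset + len) as dirty. */
static void mark_range_dirty(struct folio_dirty_state *fds,
			     unsigned int offset, unsigned int len)
{
	unsigned int first = offset / CHUNK_SIZE;
	unsigned int last = (offset + len - 1) / CHUNK_SIZE;
	unsigned int i;

	for (i = first; i <= last; i++)
		fds->dirty_bitmap |= 1ULL << i;
}

/* Walk only the dirty chunks; a real implementation would submit I/O here. */
static void writeback_dirty_chunks(struct folio_dirty_state *fds)
{
	unsigned int i;

	for (i = 0; i < 64; i++) {
		if (fds->dirty_bitmap & (1ULL << i))
			printf("write back chunk %u: bytes [%u, %u)\n",
			       i, i * CHUNK_SIZE, (i + 1) * CHUNK_SIZE);
	}
	fds->dirty_bitmap = 0;			/* folio is clean again */
}

int main(void)
{
	struct folio_dirty_state fds = { 0 };

	/* A small random write dirties one 32 KB chunk, not all 2 MB. */
	mark_range_dirty(&fds, 100 * 1024, 512);
	writeback_dirty_chunks(&fds);
	return 0;
}

As noted in the reply above, this fine-grained tracking only helps when the
first write to a chunk can actually be caught (PTE mappings or writes through
syscalls); a PMD mapping can only write-protect the whole 2 MB, so the entire
bitmap would have to be marked dirty in that case.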