On Fri 17-01-25 12:40:15, Vlastimil Babka wrote:
> On 1/15/25 01:50, Joanne Koong wrote:
> > Hi all,
> >
> > I would like to propose a discussion topic about improving large folio
> > writeback performance. As more filesystems adopt large folios, it
> > becomes increasingly important that writeback is made to be as
> > performant as possible. There are two areas I'd like to discuss:
> >
> >
> > == Granularity of dirty pages writeback ==
> > Currently, the granularity of writeback is at the folio level. If one
> > byte in a folio is dirty, the entire folio will be written back. This
> > becomes unscalable for larger folios and significantly degrades
> > performance, especially for workloads that employ random writes.
> >
> > One idea is to track dirty pages at a smaller granularity using a
> > 64-bit bitmap stored inside the folio struct, where each bit tracks a
> > smaller chunk of pages (e.g. for 2 MB folios, each bit would track
> > 32 KB worth of pages), and only write back dirty chunks rather than
> > the entire folio.
>
> I think this might be tricky in some cases? I.e. with a 2 MB, pmd-mapped
> folio, it's possible to write-protect only the whole pmd, not individual
> 32 KB chunks, in order to catch the first write to a chunk and mark it
> dirty.

Definitely. Once you map a folio through a PMD entry, you have no other
option than to consider the whole 2 MB dirty. But with PTE mappings or
modifications through syscalls you can do more fine-grained dirtiness
tracking, and there are enough cases like that that it pays off.

								Honza
--
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR
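
[Editorial note: the following is a minimal, standalone sketch of the per-chunk
dirty bitmap idea discussed above, not an existing kernel API. The struct and
helper names, and the assumption of a 2 MB folio split into 64 chunks of 32 KB
(one bitmap bit per chunk), are hypothetical and only illustrate why a small
random write would dirty a single chunk rather than the whole folio.]

/*
 * Hypothetical sketch: one 64-bit bitmap per folio, one bit per 32 KB
 * chunk of a 2 MB folio. Writeback walks only the set bits instead of
 * writing back the entire folio.
 */
#include <stdint.h>
#include <stdio.h>

#define FOLIO_SIZE	(2u << 20)		/* 2 MB folio */
#define CHUNK_SIZE	(FOLIO_SIZE / 64)	/* 32 KB tracked per bit */

struct folio_dirty_state {
	uint64_t dirty_bitmap;			/* bit n set => chunk n dirty */
};

/* Mark every chunk overlapping [offset, offset + len) as dirty. */
static void mark_range_dirty(struct folio_dirty_state *fds,
			     unsigned int offset, unsigned int len)
{
	unsigned int first = offset / CHUNK_SIZE;
	unsigned int last = (offset + len - 1) / CHUNK_SIZE;
	unsigned int i;

	for (i = first; i <= last; i++)
		fds->dirty_bitmap |= 1ULL << i;
}

/* Walk only the dirty chunks; a real implementation would submit I/O here. */
static void writeback_dirty_chunks(struct folio_dirty_state *fds)
{
	unsigned int i;

	for (i = 0; i < 64; i++) {
		if (fds->dirty_bitmap & (1ULL << i))
			printf("write back chunk %u: bytes [%u, %u)\n",
			       i, i * CHUNK_SIZE, (i + 1) * CHUNK_SIZE);
	}
	fds->dirty_bitmap = 0;			/* folio is clean again */
}

int main(void)
{
	struct folio_dirty_state fds = { 0 };

	/* A small random write dirties one 32 KB chunk, not all 2 MB. */
	mark_range_dirty(&fds, 100 * 1024, 512);
	writeback_dirty_chunks(&fds);
	return 0;
}

As noted in the reply above, this fine-grained tracking only helps when the
first write to a chunk can actually be caught (PTE mappings or writes through
syscalls); a PMD mapping can only write-protect the whole 2 MB, so the entire
bitmap would have to be marked dirty in that case.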