On Fri, Jun 02, 2023 at 11:24:44PM +0100, Matthew Wilcox (Oracle) wrote: > If we have a large folio, we can copy in larger chunks than PAGE_SIZE. > Start at the maximum page cache size and shrink by half every time we > hit the "we are short on memory" problem. > > Signed-off-by: Matthew Wilcox (Oracle) <willy@xxxxxxxxxxxxx> > --- > fs/iomap/buffered-io.c | 22 +++++++++++++--------- > 1 file changed, 13 insertions(+), 9 deletions(-) > > diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c > index a10f9c037515..10434b07e0f9 100644 > --- a/fs/iomap/buffered-io.c > +++ b/fs/iomap/buffered-io.c > @@ -768,6 +768,7 @@ static size_t iomap_write_end(struct iomap_iter *iter, loff_t pos, size_t len, > static loff_t iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i) > { > loff_t length = iomap_length(iter); > + size_t chunk = PAGE_SIZE << MAX_PAGECACHE_ORDER; > loff_t pos = iter->pos; > ssize_t written = 0; > long status = 0; > @@ -776,15 +777,13 @@ static loff_t iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i) > > do { > struct folio *folio; > - struct page *page; > - unsigned long offset; /* Offset into pagecache page */ > - unsigned long bytes; /* Bytes to write to page */ > + size_t offset; /* Offset into folio */ > + unsigned long bytes; /* Bytes to write to folio */ > size_t copied; /* Bytes copied from user */ > > - offset = offset_in_page(pos); > - bytes = min_t(unsigned long, PAGE_SIZE - offset, > - iov_iter_count(i)); > again: > + offset = pos & (chunk - 1); > + bytes = min(chunk - offset, iov_iter_count(i)); > status = balance_dirty_pages_ratelimited_flags(mapping, > bdp_flags); > if (unlikely(status)) > @@ -814,11 +813,14 @@ static loff_t iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i) > if (iter->iomap.flags & IOMAP_F_STALE) > break; > > - page = folio_file_page(folio, pos >> PAGE_SHIFT); > + offset = offset_in_folio(folio, pos); > + if (bytes > folio_size(folio) - offset) > + bytes = folio_size(folio) - offset; > + > if (mapping_writably_mapped(mapping)) > - flush_dcache_page(page); > + flush_dcache_folio(folio); > > - copied = copy_page_from_iter_atomic(page, offset, bytes, i); > + copied = copy_page_from_iter_atomic(&folio->page, offset, bytes, i); I think I've gotten lost in the weeds. Does copy_page_from_iter_atomic actually know how to deal with a multipage folio? AFAICT it takes a page, kmaps it, and copies @bytes starting at @offset in the page. If a caller feeds it a multipage folio, does that all work correctly? Or will the pagecache split multipage folios as needed to make it work right? If we create a 64k folio at pos 0 and then want to write a byte at pos 40k, does __filemap_get_folio break up the 64k folio so that the folio returned by iomap_get_folio starts at 40k? Or can the iter code handle jumping ten pages into a 16-page folio and I just can't see it? (Allergies suddenly went from 0 to 9, engage breaindead mode...) --D > > status = iomap_write_end(iter, pos, bytes, copied, folio); > > @@ -835,6 +837,8 @@ static loff_t iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i) > */ > if (copied) > bytes = copied; > + if (chunk > PAGE_SIZE) > + chunk /= 2; > goto again; > } > pos += status; > -- > 2.39.2 >