On Wed, Jul 31, 2019 at 08:59:55PM -0700, Matthew Wilcox wrote:
> -	nbits = BITS_TO_LONGS(page_size(page) / SECTOR_SIZE);
> -	iop = kmalloc(struct_size(iop, uptodate, nbits),
> -		      GFP_NOFS | __GFP_NOFAIL);
> -	atomic_set(&iop->read_count, 0);
> -	atomic_set(&iop->write_count, 0);
> -	bitmap_zero(iop->uptodate, nbits);
> +	n = BITS_TO_LONGS(page_size(page) >> inode->i_blkbits);
> +	iop = kmalloc(struct_size(iop, uptodate, n),
> +		      GFP_NOFS | __GFP_NOFAIL | __GFP_ZERO);

I am really worried about potentially very large GFP_NOFS | __GFP_NOFAIL
allocations here.  Thinking about this a bit more while walking at the
beach, I wonder if a better option is to just allocate one iomap_page per
tail page if needed, rather than blowing up the one on the head page.
We'd still always use the read_count and write_count in the head page,
but the bitmaps in the tail pages, which should be pretty easily doable.

Note that we'll also need to do another optimization first that I skipped
in the initial iomap writeback path work: we only really need an
iomap_page if the block size is smaller than the page size and there
actually is an extent boundary inside that page.  If a (small or huge)
page is backed by a single extent, we can skip the whole iomap_page
thing.  That is at least for now, because I have a series adding optional
T10 protection information tuples (8 bytes per 512 bytes of data) to the
end of the iomap_page, which would grow it quite a bit for the PI case,
and would also make allocating the uptodate bitmap dynamically uglier
(but not impossible).

Note that we'll also need to remove the line that limits the iomap
allocation size in iomap_begin to 1024 times the page size, to get a
better chance at contiguous allocations for huge page faults and to
generally avoid pointless round trips to the allocator.  It might or
might not be time to revisit that limit in general, not just for huge
pages.