On Wed, Jul 31, 2019 at 08:59:55PM -0700, Matthew Wilcox wrote:
> -	nbits = BITS_TO_LONGS(page_size(page) / SECTOR_SIZE);
> -	iop = kmalloc(struct_size(iop, uptodate, nbits),
> -		      GFP_NOFS | __GFP_NOFAIL);
> -	atomic_set(&iop->read_count, 0);
> -	atomic_set(&iop->write_count, 0);
> -	bitmap_zero(iop->uptodate, nbits);
> +	n = BITS_TO_LONGS(page_size(page) >> inode->i_blkbits);
> +	iop = kmalloc(struct_size(iop, uptodate, n),
> +		      GFP_NOFS | __GFP_NOFAIL | __GFP_ZERO);

I am really worried about potentially very large GFP_NOFS | __GFP_NOFAIL
allocations here.  Thinking about this a bit more while walking at the
beach, I wonder if a better option is to just allocate one iomap_page per
tail page if needed, rather than blowing up the one on the head page.
We'd still always use the read_count and write_count in the head page,
but the bitmaps in the tail pages, which should be pretty easily doable.

Note that we'll also need to do another optimization first that I skipped
in the initial iomap writeback path work: we only really need an
iomap_page if the block size is smaller than the page size and there
actually is an extent boundary inside that page.  If a (small or huge)
page is backed by a single extent, we can skip the whole iomap_page
thing.  That is at least for now, because I have a series adding optional
T10 protection information tuples (8 bytes per 512 bytes of data) to the
end of the iomap_page, which would grow it quite a bit for the PI case,
and would also make allocating the uptodate bitmap dynamically uglier
(but not impossible).

Note that we'll also need to remove the line that limits the iomap
allocation size in iomap_begin to 1024 times the page size, to get a
better chance at contiguous allocations for huge page faults and to
generally avoid pointless round trips to the allocator.  It might or
might not be time to revisit that limit in general, not just for huge
pages.