On 5/29/24 15:45, Pankaj Raghav (Samsung) wrote:
From: Pankaj Raghav <p.raghav@xxxxxxxxxxx>

iomap_dio_zero() will pad a fs block with zeroes if the direct IO size
< fs block size. iomap_dio_zero() has an implicit assumption that fs
block size < page_size. This is true for most filesystems at the moment.

If the block size > page size, this will send the contents of the page
next to the zero page (as len > PAGE_SIZE) to the underlying block
device, causing FS corruption.

iomap is a generic infrastructure and it should not make any assumptions
about the fs block size and the page size of the system.

Signed-off-by: Pankaj Raghav <p.raghav@xxxxxxxxxxx>
---
After discussing this a bit at LSFMM, it was clear that using a PMD
sized zero folio might not be a good idea[0]. Especially on platforms
with a 64k base page size, the huge zero folio can be as large as 512M,
just for zeroing small block sizes in the direct IO path.

The idea to use iomap_init to allocate a 64k zero buffer was suggested
by Dave Chinner, as it gives a decent tradeoff between memory usage and
efficiency.

This is a good enough solution for now, as moving beyond a 64k block
size in XFS might take a while. We can work on a more generic solution
in the future to offer different sized zero folios that can go beyond
64k.

[0] https://lore.kernel.org/linux-fsdevel/ZkdcAsENj2mBHh91@xxxxxxxxxxxxxxxxxxxx/
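To make the failure mode described above concrete, here is a minimal
userspace sketch in C. It is not the kernel code from the patch: the
names ZERO_BUF_SIZE, zero_buf, pad_with_zeroes() and overrun_bytes() are
hypothetical, and the chunking loop is an assumption added for
illustration (the patch notes only say that a 64k zero buffer is
allocated). overrun_bytes() shows how far a single read of len bytes
would run past a zero source of a given size, and pad_with_zeroes()
shows one way to never read beyond the end of the zero buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: a 64 KiB zero buffer, the size mentioned in the patch notes. */
#define ZERO_BUF_SIZE (64 * 1024)

/* Static storage duration, so the buffer is zero-initialised. */
static const unsigned char zero_buf[ZERO_BUF_SIZE];

/*
 * Hypothetical helper: number of bytes a single read of `len` bytes
 * would take from memory beyond a zero source of `src_size` bytes.
 * A non-zero result corresponds to the "page next to the zero page"
 * problem from the commit message.
 */
size_t overrun_bytes(size_t len, size_t src_size)
{
    return len > src_size ? len - src_size : 0;
}

/*
 * Hypothetical helper: write `len` zero bytes to `dst`, reading at most
 * ZERO_BUF_SIZE bytes from zero_buf per copy. Returns the number of
 * copies that were needed.
 */
size_t pad_with_zeroes(unsigned char *dst, size_t len)
{
    size_t chunks = 0;

    assert(dst != NULL || len == 0);

    while (len > 0) {
        size_t n = len < ZERO_BUF_SIZE ? len : ZERO_BUF_SIZE;

        memcpy(dst, zero_buf, n);
        dst += n;
        len -= n;
        chunks++;
    }
    return chunks;
}
```

With a 4k zero page and a 16k fs block, overrun_bytes(16384, 4096)
reports 12288 bytes taken from whatever follows the zero page. With a
64k zero buffer the overrun is zero for any block size up to 64k, which
matches the tradeoff described in the notes above.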
Reviewed-by: Hannes Reinecke <hare@xxxxxxx>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                  Kernel Storage Architect
hare@xxxxxxx                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich