Hi all,

While testing random buffered writes, I found that the bandwidth on EXT4 is much better than on XFS in the same environment. The test case, test result and test environment are as follows:

Test case:
fio --ioengine=sync --rw=randwrite --iodepth=64 --size=4G --name=test --filename=/mnt/testfile --bs=4k

Before running fio, I use dd (if=/dev/zero of=/mnt/testfile bs=1M count=4096) to warm up the file in the page cache.

Test result (bandwidth):
    ext4        xfs
    ~300MB/s    ~120MB/s

Test environment:
    Platform:       arm64
    Kernel:         v5.7
    PAGESIZE:       64K
    Memtotal:       16G
    Storage:        sata ssd (max bandwidth about 350MB/s)
    FS block size:  4K

The fio result shows that EXT4 has more than 2x the bandwidth of XFS, but iostat shows that the transfer speed of XFS to the SSD is about 300MB/s too. So I suspect that XFS is writing back many non-dirty blocks to the SSD while writing back dirty pages.

I tried to read the core writeback code of both filesystems and found that XFS writes back blocks that are uptodate (see iomap_writepage_map()), while EXT4 writes back only blocks that are dirty (see ext4_bio_write_page()). XFS has used iomap instead of buffer heads since v4.8, and iomap has only a bitmap that tracks each block's uptodate status; I found no 'dirty' concept there.

My question is: is this the reason why XFS writes many extra blocks to the SSD when doing random buffered writes? If it is, then why don't we track the dirty status of blocks in XFS?

With these questions in mind, I started digging into XFS's history and found this comment in v2.6.12:

/*
 * Calling this without startio set means we are being asked to make a dirty
 * page ready for freeing it's buffers. When called with startio set then
 * we are coming from writepage.
 * When called with startio set it is important that we write the WHOLE
 * page if possible.
 * The bh->b_state's cannot know if any of the blocks or which block for
 * that matter are dirty due to mmap writes, and therefore bh uptodate is
 * only vaild if the page itself isn't completely uptodate. Some layers
 * may clear the page dirty flag prior to calling write page, under the
 * assumption the entire page will be written out; by not writing out the
 * whole page the page can be reused before all valid dirty data is
 * written out. Note: in the case of a page that has been dirty'd by
 * mapwrite and but partially setup by block_prepare_write the
 * bh->b_states's will not agree and only ones setup by BPW/BCW will
 * have valid state, thus the whole page must be written out thing.
 */
STATIC int xfs_page_state_convert()

From the comment above, it seems this has something to do with mmap, but I can't quite get the point, so I am turning to you for help. In any case, I would not expect such a large difference in random write performance between XFS and EXT4.

Any reply would be appreciated.

Thanks in advance.
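
P.S. To make the suspicion concrete, here is a toy model (not kernel code) of the two writeback policies for a single 64K page with 4K blocks. All names in it are made up for illustration. It assumes that every block of the page is uptodate after the dd warm-up, and that writeback submits exactly the blocks the policy selects.

```python
# Toy model of per-page writeback; this is NOT how the kernel is implemented.
# All names here are hypothetical and exist only for this illustration.

PAGE_SIZE = 64 * 1024    # PAGESIZE from the test environment
BLOCK_SIZE = 4 * 1024    # FS block size from the test environment
BLOCKS_PER_PAGE = PAGE_SIZE // BLOCK_SIZE  # 16 blocks per page


def bytes_written_back(dirty_blocks, uptodate_blocks, track_dirty):
    """Bytes submitted for one dirty page under the chosen policy.

    track_dirty=True  models "write only dirty blocks".
    track_dirty=False models "write every uptodate block".
    """
    selected = dirty_blocks if track_dirty else uptodate_blocks
    return len(selected) * BLOCK_SIZE


def implied_dirty_blocks_per_page(device_mbps, fio_mbps):
    """Average dirty blocks per page that would explain the observed ratio,
    assuming the whole page is written whenever any block is dirty."""
    amplification = device_mbps / fio_mbps
    return BLOCKS_PER_PAGE / amplification


if __name__ == "__main__":
    all_blocks = set(range(BLOCKS_PER_PAGE))  # assumed uptodate after dd
    one_dirty = {3}                           # a single random 4K write

    print(bytes_written_back(one_dirty, all_blocks, track_dirty=True))
    print(bytes_written_back(one_dirty, all_blocks, track_dirty=False))
    print(implied_dirty_blocks_per_page(300, 120))
```

In this model a single dirty 4K block costs 4K of I/O when dirty state is tracked and 64K when only uptodate state is tracked, i.e. 16x. The ratio I observed (about 300MB/s on the device versus about 120MB/s in fio, so 2.5x) would correspond to roughly 6.4 dirty blocks per page on average, but that is only a rough consistency check under the assumptions above, not a measurement.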