On Mon, Nov 13, 2023 at 05:57:52PM -0800, Ming Lin wrote:
> Hi,
>
> We are currently conducting performance tests on an application that
> involves writing/reading data to/from ext4 or a raw block device.
> Specifically, for raw block device access, we have implemented a
> simple "userspace filesystem" directly on top of it.
>
> All write/read operations are being tested using buffer_io. However,
> we have observed that the ext4+buffer_io performance significantly
> outperforms raw_block_device+buffer_io:
>
> ext4: write 18G/s, read 40G/s
> raw block device: write 18G/s, read 21G/s

Can you share your exact test case?

I tried the following fio tests on both ext4 over nvme and raw nvme,
and the result is the opposite: raw block device throughput is 2X that
of ext4, and it can be observed in both VM and real hardware.

1) raw NVMe

fio --direct=0 --size=128G --bs=64k --runtime=20 --numjobs=8 --ioengine=psync \
    --group_reporting=1 --filename=/dev/nvme0n1 --name=test-read --rw=read

2) ext4

fio --size=1G --time_based --bs=4k --runtime=20 --numjobs=8 \
    --ioengine=psync --directory=$DIR --group_reporting=1 \
    --unlink=0 --direct=0 --fsync=0 --name=f1 --stonewall --rw=read

>
> We are exploring potential reasons for this difference. One hypothesis
> is related to the page cache radix tree being per inode. Could it be
> that, for the raw_block_device, there is only one radix tree, leading
> to increased lock contention during write/read buffer_io operations?

'perf record/report' should show the hot spot if lock contention is the
reason.

Thanks,
Ming
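
As a minimal sketch of that kind of check (assuming the raw-device fio
read job above is left running in another shell, and that perf is
installed on the test box), a system-wide profile over the 20s run can
show whether lock or page cache functions dominate:

perf record -a -g -- sleep 20    # sample all CPUs with call graphs for 20s
perf report                      # inspect the hottest symbols / call chains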