I tested this batch of patches with fio, and it indeed gives a significant
speedup (roughly 14% to 22%) for sequential reads when the block size is 4KiB.
The results are as follows. For async reads, iodepth is set to 128; the other
settings are self-evident.

casename                 upstream   withFix   speedup
----------------------   --------   -------   --------
randread-4k-sync            48991     47773   -2.4862%
seqread-4k-sync           1162758   1422955   22.3776%
seqread-1024k-sync        1460208   1452522   -0.5264%
randread-4k-libaio          47467     47309   -0.3329%
randread-4k-posixaio        49190     49512    0.6546%
seqread-4k-libaio         1085932   1234635   13.6936%
seqread-1024k-libaio      1423341   1402214   -1.4843%
seqread-4k-posixaio       1165084   1369613   17.5549%
seqread-1024k-posixaio    1435422   1408808   -1.8541%
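
For anyone who wants to re-check the numbers: the speedup column appears to be
the relative change of withFix over upstream, i.e.
(withFix - upstream) / upstream. Below is a minimal sketch that recomputes it
from the figures above. The function name and the RESULTS list are only
illustrative (they are not part of the patch set or of fio), and the formula is
my reading of the table rather than something taken from the test scripts.

```python
# Rows copied from the results above: (casename, upstream, withFix).
RESULTS = [
    ("randread-4k-sync", 48991, 47773),
    ("seqread-4k-sync", 1162758, 1422955),
    ("seqread-1024k-sync", 1460208, 1452522),
    ("randread-4k-libaio", 47467, 47309),
    ("randread-4k-posixaio", 49190, 49512),
    ("seqread-4k-libaio", 1085932, 1234635),
    ("seqread-1024k-libaio", 1423341, 1402214),
    ("seqread-4k-posixaio", 1165084, 1369613),
    ("seqread-1024k-posixaio", 1435422, 1408808),
]


def speedup_pct(upstream, with_fix):
    """Relative change of with_fix over upstream, in percent (assumed formula)."""
    return (with_fix - upstream) / upstream * 100.0


if __name__ == "__main__":
    # Print one line per case in the same column order as the table.
    for name, upstream, with_fix in RESULTS:
        pct = speedup_pct(upstream, with_fix)
        print(f"{name:<24} {upstream:>9} {with_fix:>9} {pct:>9.4f}%")
```

Changes within a couple of percent in either direction (the randread and 1024k
cases) are probably within run-to-run noise, but that would need repeated runs
to confirm.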