Hi, I also don't understand why enabling O_SYNC would make things faster, but the fact that changing your kernel version changes the behaviour really does point to a change in the kernel rather than an issue in fio. You might find that running blktrace while fio is going gives some insight into what is happening. The only other thing to be wary of is whether you are resetting the cache between runs (later runs may change their behaviour based on the state the cache was left in). You might find the bcache folks (their mailing list details are on http://vger.kernel.org/vger-lists.html#linux-bcache ) can provide more assistance than we can, but if you get to the bottom of this do let us know the outcome!
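Something along these lines (untested, and assuming the /dev/bcache11 device from your runs below - substitute your own cache set UUID from /sys/fs/bcache/) would let you watch the request stream live and give each run a cold cache:

  # In one terminal: trace the bcache device and decode events as they arrive
  blktrace -d /dev/bcache11 -o - | blkparse -i -

  # Between runs: drop the page cache, then detach and re-attach the cache
  # set so later runs don't inherit its state (detaching flushes any dirty
  # data back to the backing device first)
  sync
  echo 3 > /proc/sys/vm/drop_caches
  echo 1 > /sys/block/bcache11/bcache/detach
  cat /sys/block/bcache11/bcache/state   # wait until this reports "no cache"
  echo <cache-set-uuid> > /sys/block/bcache11/bcache/attach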
On Wed, 19 Sep 2018 at 20:42, Захаров Алексей <zakharov.a.g@xxxxxxxxx> wrote:
>
> Thanks for the answer, and sorry for this mess.
>
> Here is an example with the 4.15 kernel:
> ~# uname -r
> 4.15.0-34-generic
>
> Bcache is in writeback mode:
> ~# cat /sys/class/block/bcache11/bcache/cache_mode
> writethrough [writeback] writearound none
>
> First test, libaio and sync=0:
> fio --name=test --iodepth=1 --numjobs=1 --direct=1 --filename=/dev/bcache11 --filesize=1G --blocksize=4k --rw=randwrite --sync=0 --ioengine=libaio
> test: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=1
> fio-2.2.10
> Starting 1 process
> Jobs: 1 (f=1): [w(1)] [100.0% done] [0KB/61234KB/0KB /s] [0/15.4K/0 iops] [eta 00m:00s]
>
> iostat -xt 1 result while testing:
> 09/19/18 21:52:26
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>            0.22    0.00    1.35    0.00    0.00   98.43
>
> Device:     rrqm/s wrqm/s  r/s      w/s  rkB/s    wkB/s avgrq-sz   avgqu-sz await r_await w_await svctm %util
> sdv           0.00   0.00 0.00     1.00   0.00     4.00     8.00       0.00  0.00    0.00    0.00  0.00  0.00
> nvme0c33n1    0.00   0.00 2.00     0.00   8.00     0.00     8.00       0.00  0.00    0.00    0.00  0.00  0.00
> bcache11      0.00   0.00 0.00 15894.00   0.00 63576.00     8.00 3098732.18  0.03    0.00    0.03  0.03 52.40
>
> Second test with libaio and sync=1:
> fio --name=test --iodepth=1 --numjobs=1 --direct=1 --filename=/dev/bcache11 --filesize=1G --blocksize=4k --rw=randwrite --sync=1 --ioengine=libaio
> test: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=1
> fio-2.2.10
> Starting 1 process
> Jobs: 1 (f=1): [w(1)] [100.0% done] [0KB/88123KB/0KB /s] [0/22.3K/0 iops] [eta 00m:00s]
>
> iostat while testing:
> 09/19/18 21:54:17
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>            0.19    0.00    1.16    0.00    0.00   98.65
>
> Device:     rrqm/s wrqm/s  r/s      w/s  rkB/s    wkB/s avgrq-sz   avgqu-sz await r_await w_await svctm %util
> sdv           0.00   0.00 0.00     1.00   0.00     4.00     8.00       0.00  0.00    0.00    0.00  0.00  0.00
> nvme0c33n1    0.00   0.00 2.00     0.00   8.00     0.00     8.00       0.00  0.00    0.00    0.00  0.00  0.00
> bcache11      0.00   0.00 0.00 22118.00   0.00 88472.00     8.00 1014565.97  0.04    0.00    0.04  0.01 16.40
>
> Third test with fsync=1 and libaio:
> fio --name=test --iodepth=1 --numjobs=1 --direct=1 --filename=/dev/bcache11 --filesize=1G --blocksize=4k --rw=randwrite --fsync=1 --ioengine=libaio
> test: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=1
> fio-2.2.10
> Starting 1 process
> Jobs: 1 (f=1): [w(1)] [100.0% done] [0KB/21280KB/0KB /s] [0/5320/0 iops] [eta 00m:00s]
>
> iostat:
> 09/19/18 21:56:52
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>            0.19    0.00    0.91    1.38    0.00   97.52
>
> Device:     rrqm/s wrqm/s  r/s      w/s  rkB/s    wkB/s avgrq-sz   avgqu-sz await r_await w_await svctm %util
> sdv           0.00   0.00 0.00  5959.00   0.00     4.00     0.00       0.00  0.09    0.00    0.09  0.00  0.00
> nvme0c33n1    0.00   0.00 2.00     0.00   8.00     0.00     8.00       0.00  0.00    0.00    0.00  0.00  0.00
> bcache11      0.00   0.00 0.00 11915.00   0.00 23832.00     4.00 1548362.98  0.06    0.00    0.06  0.02 23.20
>
> Fourth test with sync=1 and posixaio:
> fio --name=test --iodepth=1 --numjobs=1 --direct=1 --filename=/dev/bcache11 --filesize=1G --blocksize=4k --rw=randwrite --sync=1 --ioengine=posixaio
> test: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=posixaio, iodepth=1
> fio-2.2.10
> Starting 1 process
> Jobs: 1 (f=1): [w(1)] [100.0% done] [0KB/27080KB/0KB /s] [0/6770/0 iops] [eta 00m:00s]
>
> iostat:
> 09/19/18 21:59:50
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>            0.09    0.00    0.56    2.26    0.00   97.08
>
> Device:     rrqm/s wrqm/s  r/s      w/s  rkB/s    wkB/s avgrq-sz  avgqu-sz await r_await w_await svctm %util
> sdv           0.00   0.00 0.00  6605.00   0.00     4.00     0.00      0.00  0.09    0.00    0.09  0.00  0.00
> nvme0c33n1    0.00   1.00 2.00     3.00   8.00    12.50     8.20      0.00  0.00    0.00    0.00  0.00  0.00
> bcache11      0.00   0.00 0.00 13208.00   0.00 26416.00     4.00 838177.72  0.07    0.00    0.07  0.01 11.60
>
> The results of the last two tests are understandable: for every fio write request we see a write request plus a flush request on the caching device and a flush request on the backing device, and fio manages only about 6K IOPS because of the slow backing device.
> But the results of the first two tests look a bit weird to me: the test with sync=1 shows more IOPS than the test with sync=0, and there are no flush requests when sync=1.
> I've tried to figure out whether fio opens the file with O_SYNC by running it under strace:
> strace -e 'open' fio --name=test --iodepth=1 --numjobs=1 --direct=1 --filename=/dev/bcache11 --filesize=1G --blocksize=4k --rw=randwrite --sync=1 --ioengine=libaio
> And I found that it is OK:
> open("/dev/bcache11", O_RDWR|O_SYNC|O_DIRECT|O_NOATIME) = 3
>
> This behaviour is not reproducible on the 4.4 kernel, which is the default for Ubuntu 16.04 (that's why I mentioned the 4.4 kernel).
> Btw, avgqu-sz shows implausibly high values for bcache, even when no operations are in progress.
>
> I can add full fio results, run any additional tests, or provide any other info if it helps.

--
Sitsofe | http://sucs.org/~sits/