Re: Fio on bcache device with O_SYNC on 4.15 kernel doesn't work as expected

24.09.2018, 23:00, "Coly Li" <colyli@xxxxxxx>:
> On 9/21/18 4:51 PM, Захаров Алексей wrote:
>>  Hi all,
>>
>>  I've tested bcache on Ubuntu 16.04 with the hwe-edge (4.15) kernel using fio.
>>  While testing, I found that fio with --sync=1 and libaio doesn't work as expected.
>>
>>  Here is an example:
>>  ~# uname -r
>>  4.15.0-34-generic
>>
>>  Bcache is in writeback mode.
>>  ~# cat /sys/class/block/bcache11/bcache/cache_mode
>>  writethrough [writeback] writearound none
>>
>>  First test, libaio and sync=0:
>>  fio --name=test --iodepth=1 --numjobs=1 --direct=1 --filename=/dev/bcache11 --filesize=1G --blocksize=4k --rw=randwrite --sync=0 --ioengine=libaio
>>  test: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=1
>>  fio-2.2.10
>>  Starting 1 process
>>  Jobs: 1 (f=1): [w(1)] [100.0% done] [0KB/61234KB/0KB /s] [0/15.4K/0 iops] [eta 00m:00s]
>>
>>  iostat -xt 1 result while testing:
>>  09/19/18 21:52:26
>>  avg-cpu: %user %nice %system %iowait %steal %idle
>>              0.22 0.00 1.35 0.00 0.00 98.43
>>
>>  Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
>>  sdv 0.00 0.00 0.00 1.00 0.00 4.00 8.00 0.00 0.00 0.00 0.00 0.00 0.00
>>  nvme0c33n1 0.00 0.00 2.00 0.00 8.00 0.00 8.00 0.00 0.00 0.00 0.00 0.00 0.00
>>  bcache11 0.00 0.00 0.00 15894.00 0.00 63576.00 8.00 3098732.18 0.03 0.00 0.03 0.03 52.40
>>
>>  Second test with libaio and sync=1:
>>  fio --name=test --iodepth=1 --numjobs=1 --direct=1 --filename=/dev/bcache11 --filesize=1G --blocksize=4k --rw=randwrite --sync=1 --ioengine=libaio
>>  test: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=1
>>  fio-2.2.10
>>  Starting 1 process
>>  Jobs: 1 (f=1): [w(1)] [100.0% done] [0KB/88123KB/0KB /s] [0/22.3K/0 iops] [eta 00m:00s]
>>
>>  iostat while testing:
>>  09/19/18 21:54:17
>>  avg-cpu: %user %nice %system %iowait %steal %idle
>>              0.19 0.00 1.16 0.00 0.00 98.65
>>
>>  Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
>>  sdv 0.00 0.00 0.00 1.00 0.00 4.00 8.00 0.00 0.00 0.00 0.00 0.00 0.00
>>  nvme0c33n1 0.00 0.00 2.00 0.00 8.00 0.00 8.00 0.00 0.00 0.00 0.00 0.00 0.00
>>  bcache11 0.00 0.00 0.00 22118.00 0.00 88472.00 8.00 1014565.97 0.04 0.00 0.04 0.01 16.40
>>
>>  Third test with fsync=1 and libaio:
>>  fio --name=test --iodepth=1 --numjobs=1 --direct=1 --filename=/dev/bcache11 --filesize=1G --blocksize=4k --rw=randwrite --fsync=1 --ioengine=libaio
>>  test: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=1
>>  fio-2.2.10
>>  Starting 1 process
>>  Jobs: 1 (f=1): [w(1)] [100.0% done] [0KB/21280KB/0KB /s] [0/5320/0 iops] [eta 00m:00s]
>>
>>  iostat:
>>     09/19/18 21:56:52
>>  avg-cpu: %user %nice %system %iowait %steal %idle
>>              0.19 0.00 0.91 1.38 0.00 97.52
>>
>>  Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
>>  sdv 0.00 0.00 0.00 5959.00 0.00 4.00 0.00 0.00 0.09 0.00 0.09 0.00 0.00
>>  nvme0c33n1 0.00 0.00 2.00 0.00 8.00 0.00 8.00 0.00 0.00 0.00 0.00 0.00 0.00
>>  bcache11 0.00 0.00 0.00 11915.00 0.00 23832.00 4.00 1548362.98 0.06 0.00 0.06 0.02 23.20
>>
>>  Fourth test with sync=1 and posixaio:
>>  fio --name=test --iodepth=1 --numjobs=1 --direct=1 --filename=/dev/bcache11 --filesize=1G --blocksize=4k --rw=randwrite --sync=1 --ioengine=posixaio
>>  test: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=posixaio, iodepth=1
>>  fio-2.2.10
>>  Starting 1 process
>>  Jobs: 1 (f=1): [w(1)] [100.0% done] [0KB/27080KB/0KB /s] [0/6770/0 iops] [eta 00m:00s]
>>
>>  iostat:
>>  09/19/18 21:59:50
>>  avg-cpu: %user %nice %system %iowait %steal %idle
>>              0.09 0.00 0.56 2.26 0.00 97.08
>>
>>  Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
>>  sdv 0.00 0.00 0.00 6605.00 0.00 4.00 0.00 0.00 0.09 0.00 0.09 0.00 0.00
>>  nvme0c33n1 0.00 1.00 2.00 3.00 8.00 12.50 8.20 0.00 0.00 0.00 0.00 0.00 0.00
>>  bcache11 0.00 0.00 0.00 13208.00 0.00 26416.00 4.00 838177.72 0.07 0.00 0.07 0.01 11.60
>>
>>  The results of the last two tests are understandable: for each fio write request we see a write plus a flush on the caching device and a flush on the backing device. Fio IOPS are about 6K because of the slow backing device.
>>  But the results of the first two tests look a bit weird to me: the test with sync=1 shows more IOPS than the test with sync=0, and there are no flush requests when sync=1.
>>  I've tried to figure out whether fio opens the file with O_SYNC by running it under strace:
>>  strace -e 'open' fio --name=test --iodepth=1 --numjobs=1 --direct=1 --filename=/dev/bcache11 --filesize=1G --blocksize=4k --rw=randwrite --sync=1 --ioengine=libaio
>>  and found that it does:
>>  open("/dev/bcache11", O_RDWR|O_SYNC|O_DIRECT|O_NOATIME) = 3
>>
>>  This behaviour is not reproducible on the 4.4 kernel, which is the default for Ubuntu 16.04.
>>  Btw, avgqu-sz shows implausibly high values for bcache on 4.15, even when no operations are in progress.
>>
>>  What could be the root cause of this behaviour? I thought libaio might be the cause, since it is not upgraded when the 4.15 kernel is installed, but that's just a guess.
>>
>>  I can provide full fio results, run additional tests, or share any other info if it helps.
>
> Hi Aleksei,
>
> We don't treat sync requests specially; in writeback mode, requests with
> REQ_SYNC still go into the cache device.
>
> I cannot provide an easy answer, but quite a lot of changes happened
> between 4.15 and 4.18. Could you please try the latest stable kernel and
> check whether there is any difference?

I did some more tests and reinstalled the kernel packages, and now I can't reproduce the previously seen behaviour (fortunately).

As far as I can see, for the 4.15 and 4.18 kernels:
--sync=1 --ioengine=libaio:
write + flush to cache device

--fsync=1 --ioengine=libaio:
write + flush to cache device
flush to backing device

and for the 4.4 kernel:
--sync=1 --ioengine=libaio:
write + flush to cache device
flush to backing device

--fsync=1 --ioengine=libaio:
write + flush to cache device
flush to backing device
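
The write/flush patterns listed above can be spotted in blktrace output; a live capture would be `blktrace -d /dev/bcache11 -o - | blkparse -i -` (needs root, device name taken from this thread). In blkparse output the RWBS field (7th column) starts with 'F' for a preflush (REQ_PREFLUSH) and ends with 'S' for a sync write (REQ_SYNC). A runnable sketch on hand-written sample lines, not captured output:

```shell
# Count flush-carrying requests in blkparse-style lines; the two sample
# lines below are illustrations, not real captured output.
cat <<'EOF' | awk '$7 ~ /^F/ {flush++} END {print flush+0, "flush request(s)"}'
251,11  0  1  0.000000000  1234  D  WS 2048 + 8 [fio]
251,11  0  2  0.000120000  1234  D FWS 2056 + 8 [fio]
EOF
```

This prints "1 flush request(s)": only the second sample line carries a preflush.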

Please correct me if I'm wrong:
Opening a file with O_SYNC causes the REQ_SYNC flag to be set on every write I/O.
Sending a flush request causes the REQ_PREFLUSH flag to be set on the flush I/O.
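
The two paths can be exercised from userspace with plain dd, sketched here on a temp file rather than the bcache device: oflag=sync opens the file with O_SYNC (every write tagged REQ_SYNC, like fio --sync=1), while conv=fdatasync does plain writes followed by one explicit flush (REQ_PREFLUSH, like fio --fsync=1).

```shell
# Path 1: O_SYNC writes -- each write() completes only once durable.
dd if=/dev/zero of=/tmp/sync-demo.dat bs=4k count=4 oflag=sync status=none
# Path 2: plain writes plus one fdatasync() flush at the end.
dd if=/dev/zero of=/tmp/sync-demo.dat bs=4k count=4 conv=fdatasync status=none
echo "both paths completed"
```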

>
> BTW, could you please tell me why you care about the performance of a
> single thread with iodepth 1? I almost never test performance with such a
> configuration.

I've been benchmarking bcache with NVMe drives, increasing numjobs gradually.
Increasing the iodepth setting leads to 100% CPU usage by a single fio process:
it was ~50-60% CPU utilization with iodepth=1 and 100% with iodepth=4 on a raw NVMe device,
and pidstat showed ~80% sys and ~20% usr for the fio process. So I decided to keep iodepth=1 and increase numjobs instead.
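
A hypothetical sweep following that approach, holding iodepth at 1 and growing numjobs (the device name and fio options are taken from the commands earlier in this thread; the echo makes the sketch safe to run anywhere, drop it to actually execute, which needs root and writes to the device):

```shell
# Print the fio invocations for a numjobs scaling sweep at iodepth=1.
for jobs in 1 2 4 8; do
    echo fio --name=scale --iodepth=1 --numjobs="$jobs" --direct=1 \
        --filename=/dev/bcache11 --filesize=1G --blocksize=4k \
        --rw=randwrite --sync=1 --ioengine=libaio --group_reporting
done
```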


>
> Thanks.
>
> Coly Li
>
>>  --
>>  Regards,
>>  Aleksei Zakharov

-- 
Regards,
Aleksei Zakharov



