> On 19 Aug 2019, at 18:41, Paolo Valente <paolo.valente@xxxxxxxxxx> wrote:
> 
> 
> 
>> On 16 Aug 2019, at 20:17, Paolo Valente <paolo.valente@xxxxxxxxxx> wrote:
>> 
>> 
>> 
>>> On 16 Aug 2019, at 19:59, Josef Bacik <josef@xxxxxxxxxxxxxx> wrote:
>>> 
>>> On Fri, Aug 16, 2019 at 07:52:40PM +0200, Paolo Valente wrote:
>>>> 
>>>> 
>>>>> On 16 Aug 2019, at 15:21, Josef Bacik <josef@xxxxxxxxxxxxxx> wrote:
>>>>> 
>>>>> On Fri, Aug 16, 2019 at 12:57:41PM +0200, Paolo Valente wrote:
>>>>>> Hi,
>>>>>> I happened to test the io.latency controller, to make a comparison
>>>>>> between this controller and BFQ. But io.latency seems not to work,
>>>>>> i.e., not to reduce latency compared with what happens with no I/O
>>>>>> control at all. Here is a summary of the results for one of the
>>>>>> workloads I tested, on three different devices (latencies in ms):
>>>>>> 
>>>>>>             no I/O control   io.latency   BFQ
>>>>>> NVMe SSD    1.9              1.9          0.07
>>>>>> SATA SSD    39               56           0.7
>>>>>> HDD         4500             4500         11
>>>>>> 
>>>>>> I have put all details on hardware, OS, scenarios and results in the
>>>>>> attached pdf. For your convenience, I'm pasting the source file too.
>>>>>> 
>>>>> 
>>>>> Do you have the fio jobs you use for this?
>>>> 
>>>> The script mentioned in the draft (executed with the command line
>>>> reported in the draft) runs one fio instance for the target process,
>>>> and one fio instance for each interferer. I couldn't do with just one
>>>> fio instance executing all jobs, because the weight parameter doesn't
>>>> work in fio jobfiles for some reason, and because the ioprio class
>>>> cannot be set for individual jobs.
>>>> 
>>>> In particular, the script generates a job with the following
>>>> parameters for the target process:
>>>> 
>>>> ioengine=sync
>>>> loops=10000
>>>> direct=0
>>>> readwrite=randread
>>>> fdatasync=0
>>>> bs=4k
>>>> thread=0
>>>> filename=/mnt/scsi_debug/largefile_interfered0
>>>> iodepth=1
>>>> numjobs=1
>>>> invalidate=1
>>>> 
>>>> and a job with the following parameters for each of the interferers,
>>>> in the case, e.g., of a workload made of reads:
>>>> 
>>>> ioengine=sync
>>>> direct=0
>>>> readwrite=read
>>>> fdatasync=0
>>>> bs=4k
>>>> filename=/mnt/scsi_debug/largefileX
>>>> invalidate=1
>>>> 
>>>> Should you fail to reproduce this issue by creating groups, setting
>>>> latencies and starting fio jobs manually, could you try just executing
>>>> my script? Maybe this would help us spot the culprit more quickly.
>>> 
>>> Ah ok, you are doing it on a mountpoint.
>> 
>> Yep
>> 
>>> Are you using btrfs?
>> 
>> ext4
>> 
>>> Cause otherwise
>>> you are going to have a sad time.
>> 
>> Could you elaborate more on this? I/O seems to be controllable on ext4.
>> 
>>> The other thing is you are using buffered,
>> 
>> Actually, the problem is suffered by sync random reads, which always
>> hit the disk in this test.
>> 
>>> which may or may not hit the disk. This is what I use to test io.latency
>>> 
>>> https://patchwork.kernel.org/patch/10714425/
>>> 
>>> I had to massage it since it didn't apply directly, but running this against the
>>> actual block device, with O_DIRECT so I'm sure to be measuring the actual impact
>>> of the controller, it all works out fine.
>> 
>> I'm not getting why non-direct sync reads, or buffered writes, should
>> be uncontrollable. As a trivial example, BFQ in these tests controls
>> I/O as expected, and keeps latency extremely low.
>> 
>> What am I missing?
>> 
> 
> While waiting for your answer, I've also added the direct-I/O case to
> my tests. This new case too is now reproduced by the command line
> reported in the draft.
> 
> Even with direct I/O, nothing changes with writers as interferers,
> apart from latency becoming at least equal to the case of no I/O
> control for the HDD. Summing up, with writers as interferers (latency
> in ms):
> 
>             no I/O control   io.latency   BFQ
> NVMe SSD    3                3            0.2
> SATA SSD    3                3            0.2
> HDD         56               56           13
> 
> In contrast, there are important improvements with the SSDs, in case
> of readers as interferers. This is the new situation (latency still
> in ms):
> 
>             no I/O control   io.latency   BFQ
> NVMe SSD    1.9              0.08         0.07
> SATA SSD    39               0.2          0.7
> HDD         4500             118          11
> 

I'm sorry, I hadn't repeated the tests with direct I/O for BFQ too, and
results change for BFQ as well in the case of readers as interferers.
Here are all the correct figures for readers as interferers (latency
in ms):

            no I/O control   io.latency   BFQ
NVMe SSD    1.9              0.08         0.07
SATA SSD    39               0.2          0.2
HDD         4500             118          10

Thanks,
Paolo

> Thanks,
> Paolo
> 
>> Thanks,
>> Paolo
>> 
>>> Thanks,
>>> 
>>> Josef
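
For reference, here is a minimal sketch of how the groups and jobs
described above can be reproduced by hand through the cgroup v2
interface. The cgroup names, the 8:0 device number, the io.latency
target value and the file paths are only examples, and the exact format
and unit of the target should be checked against the cgroup-v2
documentation of the running kernel:

  # run as root; assumes cgroup v2 is mounted on /sys/fs/cgroup and that
  # the test files already exist under /mnt/scsi_debug
  echo "+io" > /sys/fs/cgroup/cgroup.subtree_control
  mkdir -p /sys/fs/cgroup/interfered /sys/fs/cgroup/interferers

  # give the interfered group a latency target (8:0 and 2000 are just
  # examples; see Documentation/admin-guide/cgroup-v2.rst for the format)
  echo "8:0 target=2000" > /sys/fs/cgroup/interfered/io.latency

  # move the shell into the interferer group and start one sequential
  # reader per interferer (children inherit the group at fork time)
  echo $$ > /sys/fs/cgroup/interferers/cgroup.procs
  fio --name=interferer1 --ioengine=sync --rw=read --bs=4k --direct=0 \
      --invalidate=1 --filename=/mnt/scsi_debug/largefile1 &

  # then move the shell into the interfered group and start the target:
  # the sync random reader whose latency is measured
  echo $$ > /sys/fs/cgroup/interfered/cgroup.procs
  fio --name=interfered --ioengine=sync --rw=randread --bs=4k --direct=0 \
      --iodepth=1 --numjobs=1 --loops=10000 --invalidate=1 \
      --filename=/mnt/scsi_debug/largefile_interfered0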
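
The direct-I/O case discussed above only changes the direct flag of the
jobs (and, presumably, uses rw=write for the writer interferers); for
the BFQ column the latency target is not set at all and the I/O
scheduler of the drive is switched instead. Again only a sketch, with
the same example paths as above and sda as an example device:

  # direct-I/O variant of the target job; interferers analogously get
  # --direct=1, and --rw=write in the writer-interferer case
  fio --name=interfered --ioengine=sync --rw=randread --bs=4k --direct=1 \
      --iodepth=1 --numjobs=1 --loops=10000 --invalidate=1 \
      --filename=/mnt/scsi_debug/largefile_interfered0

  # for the BFQ runs, leave io.latency unset and switch the scheduler of
  # the drive hosting /mnt/scsi_debug
  echo bfq > /sys/block/sda/queue/scheduler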