> On 16 Aug 2019, at 20:17, Paolo Valente <paolo.valente@xxxxxxxxxx> wrote:
> 
> 
> 
>> On 16 Aug 2019, at 19:59, Josef Bacik <josef@xxxxxxxxxxxxxx> wrote:
>> 
>> On Fri, Aug 16, 2019 at 07:52:40PM +0200, Paolo Valente wrote:
>>> 
>>> 
>>>> On 16 Aug 2019, at 15:21, Josef Bacik <josef@xxxxxxxxxxxxxx> wrote:
>>>> 
>>>> On Fri, Aug 16, 2019 at 12:57:41PM +0200, Paolo Valente wrote:
>>>>> Hi,
>>>>> I happened to test the io.latency controller, to make a comparison
>>>>> between this controller and BFQ. But io.latency seems not to work,
>>>>> i.e., not to reduce latency compared with what happens with no I/O
>>>>> control at all. Here is a summary of the results for one of the
>>>>> workloads I tested, on three different devices (latencies in ms):
>>>>> 
>>>>>              no I/O control   io.latency   BFQ
>>>>> NVMe SSD     1.9              1.9          0.07
>>>>> SATA SSD     39               56           0.7
>>>>> HDD          4500             4500         11
>>>>> 
>>>>> I have put all details on hardware, OS, scenarios and results in the
>>>>> attached pdf. For your convenience, I'm pasting the source file too.
>>>>> 
>>>> 
>>>> Do you have the fio jobs you use for this?
>>> 
>>> The script mentioned in the draft (executed with the command line
>>> reported in the draft) runs one fio instance for the target process,
>>> and one fio instance for each interferer. I couldn't do it with just
>>> one fio instance executing all jobs, because the weight parameter
>>> doesn't work in fio jobfiles for some reason, and because the ioprio
>>> class cannot be set for individual jobs.
>>> 
>>> In particular, the script generates a job with the following
>>> parameters for the target process:
>>> 
>>> ioengine=sync
>>> loops=10000
>>> direct=0
>>> readwrite=randread
>>> fdatasync=0
>>> bs=4k
>>> thread=0
>>> filename=/mnt/scsi_debug/largefile_interfered0
>>> iodepth=1
>>> numjobs=1
>>> invalidate=1
>>> 
>>> and a job with the following parameters for each of the interferers,
>>> in case, e.g., of a workload made of reads:
>>> 
>>> ioengine=sync
>>> direct=0
>>> readwrite=read
>>> fdatasync=0
>>> bs=4k
>>> filename=/mnt/scsi_debug/largefileX
>>> invalidate=1
>>> 
>>> Should you fail to reproduce this issue by creating groups, setting
>>> latencies and starting fio jobs manually, could you try just
>>> executing my script? Maybe this could help us spot the culprit more
>>> quickly.
>> 
>> Ah ok, you are doing it on a mountpoint.
> 
> Yep
> 
>> Are you using btrfs?
> 
> ext4
> 
>> Cause otherwise
>> you are going to have a sad time.
> 
> Could you elaborate more on this? I/O seems to be controllable on ext4.
> 
>> The other thing is you are using buffered,
> 
> Actually, the problem is suffered by sync random reads, which always
> hit the disk in this test.
> 
>> which may or may not hit the disk. This is what I use to test io.latency
>> 
>> https://patchwork.kernel.org/patch/10714425/
>> 
>> I had to massage it since it didn't apply directly, but running this against the
>> actual block device, with O_DIRECT so I'm sure to measure the actual impact
>> of the controller, it all works out fine.
> 
> I'm not getting why non-direct sync reads, or buffered writes, should
> be uncontrollable. As a trivial example, BFQ in these tests controls
> I/O as expected, and keeps latency extremely low.
> 
> What am I missing?
> 

While waiting for your answer, I've also added the direct-I/O case to
my test. This new case too can now be reproduced with the command line
reported in the draft.
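For reference, here is a minimal sketch of what the target-process job
becomes in the direct-I/O case, assuming the only parameter that changes
with respect to the buffered job quoted above is direct=:

ioengine=sync
loops=10000
direct=1
readwrite=randread
fdatasync=0
bs=4k
thread=0
filename=/mnt/scsi_debug/largefile_interfered0
iodepth=1
numjobs=1
invalidate=1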
Even with direct I/O, nothing changes with writers as interferers,
apart from the fact that, for the HDD, latency with io.latency at
least becomes equal to that with no I/O control. Summing up, with
writers as interferers (latency in ms):

             no I/O control   io.latency   BFQ
NVMe SSD     3                3            0.2
SATA SSD     3                3            0.2
HDD          56               56           13

In contrast, there are important improvements on the SSDs in the case
of readers as interferers. This is the new situation (latency still
in ms):

             no I/O control   io.latency   BFQ
NVMe SSD     1.9              0.08         0.07
SATA SSD     39               0.2          0.7
HDD          4500             118          11

Thanks,
Paolo

> Thanks,
> Paolo
> 
>> Thanks,
>> 
>> Josef