Hi Paolo,

On 18/4/24 20:12, Paolo Valente wrote:
>
>
>> On 23 Apr 2018, at 11:01, Joseph Qi <jiangqi903@xxxxxxxxx> wrote:
>>
>>
>>
>> On 18/4/23 15:35, Paolo Valente wrote:
>>>
>>>
>>>> On 23 Apr 2018, at 08:05, Joseph Qi <jiangqi903@xxxxxxxxx> wrote:
>>>>
>>>> Hi Paolo,
>>>
>>> Hi Joseph,
>>> thanks for chiming in.
>>>
>>>> What's your idle and latency config?
>>>
>>> I didn't set them at all, as the only (explicit) requirement in my
>>> basic test is that one of the groups is guaranteed a minimum bps.
>>>
>>>
>>>> IMO, io.low will allow others to run more bandwidth if a cgroup's
>>>> average idle time is high or its latency is low.
>>>
>>> What you say here makes me think that I simply misunderstood the
>>> purpose of io.low. So, here is my problem/question: "I only need to
>>> guarantee at least a minimum bandwidth, in bps, to a group. Is the
>>> io.low limit the way to go?"
>>>
>>> I know that I can use just io.max (unless I misunderstood the goal of
>>> io.max too :( ), but my extra purpose would be to avoid wasting
>>> bandwidth when some group is idle. Yet, as of now, io.low is not
>>> working even for the first, simpler goal, i.e., guaranteeing a
>>> minimum bandwidth to one group when all groups are active.
>>>
>>> Am I getting something wrong?
>>>
>>> Otherwise, if there are some special values for the idle and latency
>>> parameters that would make throttling work for my test, I'll of
>>> course be happy to try them.
>>>
>> I think you can try an idle time of 1000us for all cgroups, and a
>> latency target of 100us for the cgroup with a low limit of 100MB/s and
>> 2000us for the cgroups with a low limit of 10MB/s. That means the
>> cgroup with the lower latency target will be preferred.
>> BTW, from my experience the parameters are not easy to set because
>> they are strongly correlated to the cgroup's IO behavior.
>>
>
> +Tejun (I guess he might be interested in the results below)
>
> Hi Joseph,
> thanks for chiming in. Your suggestion did work!
>
> At first, I thought I had also understood the use of latency from the
> outcome of your suggestion: "want the low limit really guaranteed for
> a group? Set its target latency to a low value." But then, as a
> crosscheck, I repeated the exact same test, but with the target
> latencies reversed: I gave 2000 to the interfered (the group with the
> 100MB/s limit) and 100 to the interferers. And the interfered still
> got more than 100MB/s! So I exaggerated: 20000 to the interfered.
> Same outcome :(
>
> I tried many other combinations to try to figure this out, but the
> results seemed more or less random w.r.t. the latency values. I
> didn't even start to test different values for idle.
>
> So, the only sound lesson that I seem to have learned is: if I want
> low limits to be enforced, I have to set target latency and idle
> explicitly. The actual latency values matter little, or not at all.
> At least this holds for my simple tests.
>
> At any rate, thanks to your help, Joseph, I could move to the most
> interesting part for me: how effective is blk-throttle with low
> limits? I could well be wrong again, but my results do not seem that
> good. With the simplest type of non-toy example I considered, I
> recorded throughput losses, apparently caused mainly by blk-throttle,
> ranging from 64% to 75%.
>
> Here is a worst-case example. For each step, I'm reporting below the
> command by which you can reproduce that step with the
> thr-lat-with-interference benchmark of the S suite [1]. I just split
> bandwidth equally among five groups, on my SSD.
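(Just to make the setup under discussion concrete: expressed directly through
the cgroup v2 files, a configuration along these lines would presumably look
roughly like the sketch below. It assumes a root shell, a kernel built with
CONFIG_BLK_DEV_THROTTLING_LOW, and a cgroup2 mount at /sys/fs/cgroup; the
259:0 device number and the group names are placeholders, rbps is in bytes/s,
and idle/latency are in microseconds, with the values suggested above.)

  # One "interfered" group plus four "interferer" groups, each with the
  # same 100MB/s low limit; the interfered group gets the tighter
  # latency target suggested above.
  echo "+io" > /sys/fs/cgroup/cgroup.subtree_control
  for g in interfered interferer1 interferer2 interferer3 interferer4; do
      mkdir -p /sys/fs/cgroup/$g
  done
  echo "259:0 rbps=104857600 idle=1000 latency=100" \
      > /sys/fs/cgroup/interfered/io.low
  for g in interferer1 interferer2 interferer3 interferer4; do
      echo "259:0 rbps=104857600 idle=1000 latency=2000" \
          > /sys/fs/cgroup/$g/io.low
  done
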
> The device showed a peak rate of ~515MB/s in this test, so I set rbps
> to 100MB/s for each group (and tried various values, and combinations
> of values, for the target latency, without any effect on the results).
> To begin, I made every group do sequential reads. Everything worked
> perfectly fine.
>
> But then I made one group do random I/O [2], and trouble began. Even
> though the group doing random I/O was given a target latency of
> 100usec (or lower), while the others had a target latency of 2000usec,
> the poor random-I/O group got only 4.7 MB/s! (A single process doing
> 4k sync random I/O reaches 25MB/s on my SSD.)
>
> I guess things broke because the low limits no longer matched the
> lower speed the device reached with the new, mixed workload: the
> device reached 376MB/s, while the sum of the low limits was 500MB/s.
> BTW, the 'fault' for this loss of throughput did not lie only with the
> device and the workload: if I switched throttling off, the device
> still reached its peak rate, although it granted only 1.3MB/s to the
> random-I/O group.
>
> So, to stay within the 376MB/s, I lowered the low limits to 74MB/s per
> group (to avoid a too tight 75MB/s) [3]. A little better: the
> random-I/O group got 7.2 MB/s. But the total throughput went down
> further, to 289MB/s, and became again lower than the sum of the low
> limits. Most certainly, this time the throughput went down mainly
> because blk-throttle was serving the random I/O more than before.
>
> To make a long story short, I arrived at setting just 12MB/s as the
> low limit for each group [4]. The random-I/O group was finally happy,
> with a revitalizing 12.77MB/s. But the total throughput dropped down
> to 127MB/s, i.e., ~25% of the peak rate of the device. Now the 'fault'
> for the throughput loss seemed undoubtedly blk-throttle's, which was
> evidently over-throttling some group.
>
> To sum up, for my device, 12MB/s seems to be the highest value for
> which low limits can be guaranteed. But setting these limits entails
> a high cost: if just one group really does random I/O, then 75% of the
> throughput is lost.
>
> There would be other issues too. For example, 12MB/s might be too
> little for the needs of some group in some time period. That would
> make it extremely difficult, if at all possible, to set low limits
> that comply with the needs of more dynamic (and probably more
> realistic) workloads than the above one.
>

Could you run blktrace as well when testing your case?
There are several throtl traces that help analyze whether this is
caused by frequent upgrades/downgrades.
If all cgroups are just running under their low limits, I'm afraid the
case you tested has something to do with how the SSD handles mixed
workloads.

Thanks,
Joseph

> I think this is all; sorry for the long mail, I tried to shrink it as
> much as possible. Looking forward to some feedback.
>
> Thanks,
> Paolo
>
> [1] https://github.com/Algodev-github/S
> [2] sudo ./thr-lat-with-interference.sh -b t -n 4 -w 100M -W 100M -t randread -L 2000
> [3] sudo ./thr-lat-with-interference.sh -b t -n 4 -w 74M -W 74M -t randread -L 2000
> [4] sudo ./thr-lat-with-interference.sh -b t -n 4 -w 12M -W 12M -t randread -L 2000
>
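P.S. In case it helps with the blktrace run suggested above, a pipeline along
the lines of the sketch below (the device name is only a placeholder) should
surface the throtl trace messages, which is where the upgrade/downgrade
decisions get logged:

  # Trace the benchmarked device live while the test runs and keep only
  # the blk-throttle trace messages:
  sudo blktrace -d /dev/nvme0n1 -o - | blkparse -i - | grep throtl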