On Tue, Mar 31, 2020 at 10:36 PM Tejun Heo <tj@xxxxxxxxxx> wrote:
>
> Hello, Weiping.
>
> On Tue, Mar 31, 2020 at 02:17:06PM +0800, Weiping Zhang wrote:
> > Recently I have been doing some cgroup io weight testing:
> > https://github.com/dublio/iotrack/wiki/cgroup-io-weight-test
> > I think a proper io weight policy should consider a high-weight
> > cgroup's iops and latency and also take the whole disk's throughput
> > into account; that is to say, the policy should trade off carefully
> > between a cgroup's IO performance and the whole disk's throughput.
> > I know one policy cannot do everything perfectly, but from the test
> > results nvme-wrr can work well.
>
> That's w/o iocost QoS targets configured, right? iocost should be able to
> achieve similar results as wrr with QoS configured.
>
Yes, I have not set a QoS target.

> > From the following test results, nvme-wrr works well for the cgroup's
> > latency and iops as well as the whole disk's throughput.
>
> As I wrote before, the issues I see with wrr are the following.
>
> * Hardware dependent. Some will work ok or even fantastic. Many others will
>   do horribly.
>
> * Lack of configuration granularity. We can't configure it granularly enough
>   to serve hierarchical configuration.
>
> * Likely not a huge problem with the deep QD of nvmes, but lack of queue
>   depth control can lead to loss of latency control and thus loss of
>   protection for low-concurrency workloads when pitched against workloads
>   which can saturate the QD.
>
> All that said, given the feature is available, I don't see any reason not to
> allow using it, but I don't think it fits the cgroup interface model given
> the hardware dependency and coarse granularity. For these cases, I think the
> right thing to do is use cgroups to provide tagging information - ie. build a
> dedicated interface which takes a cgroup fd or ino as the tag and associate
> configurations that way. There already are other use cases which use cgroups
> this way (e.g. perf).
>
Do you mean dropping "io.wrr" / "blkio.wrr" from the cgroup interface and
using a dedicated interface such as /dev/xxx or /proc/xxx instead?

I see the perf code does:

	struct fd f = fdget(fd);
	struct cgroup_subsys_state *css =
		css_tryget_online_from_dir(f.file->f_path.dentry,
					   &perf_event_cgrp_subsys);

It looks like the same lookup can be applied to the block cgroup; a rough
sketch follows below.
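
For illustration only, here is a minimal, untested sketch of what such a
dedicated interface could do on the driver side, assuming userspace hands in
a cgroup directory fd (e.g. via an ioctl). nvme_wrr_set_weight() and the
weight bookkeeping are hypothetical names I made up; css_tryget_online_from_dir()
and io_cgrp_subsys are the existing kernel symbols:

	#include <linux/cgroup.h>
	#include <linux/err.h>
	#include <linux/file.h>
	#include <linux/printk.h>

	/*
	 * Resolve the io controller css from a cgroup directory fd, the
	 * same way perf resolves perf_event_cgrp_subsys, then associate
	 * a WRR weight with that cgroup.
	 */
	static int nvme_wrr_set_weight(int cgroup_fd, unsigned int weight)
	{
		struct cgroup_subsys_state *css;
		struct fd f = fdget(cgroup_fd);
		int ret = 0;

		if (!f.file)
			return -EBADF;

		/* Same lookup perf does, but against the io subsystem. */
		css = css_tryget_online_from_dir(f.file->f_path.dentry,
						 &io_cgrp_subsys);
		if (IS_ERR(css)) {
			ret = PTR_ERR(css);
			goto out_fdput;
		}

		/*
		 * Hypothetical bookkeeping: remember the weight keyed by
		 * css->id so the driver can pick the matching WRR queue
		 * at submission time.
		 */
		pr_info("wrr: css id %d -> weight %u\n", css->id, weight);

		css_put(css);
	out_fdput:
		fdput(f);
		return ret;
	}

Thanks for your help.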