Fernando Luis Vázquez Cao wrote:
>>> This seems to be the easiest part, but the current cgroups
>>> infrastructure has some limitations when it comes to dealing with block
>>> devices: impossibility of creating/removing certain control structures
>>> dynamically and hardcoding of subsystems (i.e. resource controllers).
>>> This makes it difficult to handle block devices that can be hotplugged
>>> and go away at any time (this applies not only to usb storage but also
>>> to some SATA and SCSI devices). To cope with this situation properly we
>>> would need hotplug support in cgroups, but, as suggested before and
>>> discussed in the past (see (0) below), there are some limitations.
>>>
>>> Even in the non-hotplug case it would be nice if we could treat each
>>> block I/O device as an independent resource, which means we could do
>>> things like allocating I/O bandwidth on a per-device basis. As long as
>>> performance is not compromised too much, adding some kind of basic
>>> hotplug support to cgroups is probably worth it.
>>>
>>> (0) http://lkml.org/lkml/2008/5/21/12
>> What about using major,minor numbers to identify each device and account
>> IO statistics? If a device is unplugged we could reset IO statistics
>> and/or remove IO limitations for that device from userspace (i.e. by a
>> daemon), but plugging/unplugging the device would not be blocked/affected
>> in any case. Or am I oversimplifying the problem?
> If a resource we want to control (a block device in this case) is
> hot-plugged/unplugged the corresponding cgroup-related structures inside
> the kernel need to be allocated/freed dynamically, respectively. The
> problem is that this is not always possible. For example, with the
> current implementation of cgroups it is not possible to treat each block
> device as a different cgroup subsystem/resource controller, because
> subsystems are created at compile time.
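
To make the major,minor idea above a bit more concrete, here is a minimal
userspace-style sketch in C. It only illustrates the bookkeeping: entries are
keyed by (major, minor) and hold no pointer to any block-device object, so an
entry can be dropped at any time after an unplug event. All names here
(struct io_limit, io_limit_set, io_limit_account, io_limit_remove) are
hypothetical and are not part of the cgroups code; a real controller would
also need locking and a proper userspace->kernel interface.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical per-device entry, keyed by (major, minor) only. */
struct io_limit {
    unsigned int major, minor;
    unsigned long long bytes_per_sec;  /* 0 means "no limit" */
    unsigned long long bytes_done;     /* accounted IO so far */
    struct io_limit *next;
};

struct io_limit_table {
    struct io_limit *head;
};

/* Look up the entry for a device; NULL if none exists. */
static struct io_limit *io_limit_find(struct io_limit_table *t,
                                      unsigned int major, unsigned int minor)
{
    assert(t != NULL);
    for (struct io_limit *e = t->head; e; e = e->next)
        if (e->major == major && e->minor == minor)
            return e;
    return NULL;
}

/* Create or update a limit. Returns 0 on success, -1 if allocation fails. */
static int io_limit_set(struct io_limit_table *t, unsigned int major,
                        unsigned int minor, unsigned long long bytes_per_sec)
{
    struct io_limit *e = io_limit_find(t, major, minor);

    if (!e) {
        e = calloc(1, sizeof(*e));
        if (!e)
            return -1;
        e->major = major;
        e->minor = minor;
        e->next = t->head;
        t->head = e;
    }
    e->bytes_per_sec = bytes_per_sec;
    return 0;
}

/* Account IO against a device; IO to a device with no entry is ignored. */
static void io_limit_account(struct io_limit_table *t, unsigned int major,
                             unsigned int minor, unsigned long long bytes)
{
    struct io_limit *e = io_limit_find(t, major, minor);

    if (e)
        e->bytes_done += bytes;
}

/* Called after an unplug event. Returns 1 if an entry was freed, else 0. */
static int io_limit_remove(struct io_limit_table *t, unsigned int major,
                           unsigned int minor)
{
    assert(t != NULL);
    for (struct io_limit **pp = &t->head; *pp; pp = &(*pp)->next) {
        if ((*pp)->major == major && (*pp)->minor == minor) {
            struct io_limit *dead = *pp;
            *pp = dead->next;
            free(dead);
            return 1;
        }
    }
    return 0;
}
```

In this sketch a daemon would call io_limit_remove() asynchronously when it
sees the unplug event; since nothing in the table refers to the device itself,
the unplug is never blocked by a stale entry.
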

The whole subsystem is created at compile time, but controller data
structures are allocated dynamically (e.g. see struct mem_cgroup for the
memory controller). So, identifying each device with a name, or a key like
major,minor, instead of a reference/pointer to a struct could help to
handle this in userspace. I mean, if a device is unplugged, a userspace
daemon can just handle the event and delete the controller data structures
allocated for this device, asynchronously, via a userspace->kernel
interface, and without holding a reference to that particular block device
in the kernel.

Anyway, implementing a generic interface that would allow defining hooks
for hot-pluggable devices (or similar events) in cgroups would be
interesting.

>>> 3. & 4. & 5. - I/O bandwidth shaping & General design aspects
>>>
>>> The implementation of an I/O scheduling algorithm is to a certain extent
>>> influenced by what we are trying to achieve in terms of I/O bandwidth
>>> shaping, but, as discussed below, the required accuracy can determine
>>> the layer where the I/O controller has to reside. Off the top of my
>>> head, there are three basic operations we may want to perform:
>>> - I/O nice prioritization: ionice-like approach.
>>> - Proportional bandwidth scheduling: each process/group of processes
>>> has a weight that determines the share of bandwidth they receive.
>>> - I/O limiting: set an upper limit to the bandwidth a group of tasks
>>> can use.
>> Using deadline-based IO scheduling could be an interesting path to be
>> explored as well, IMHO, to try to guarantee per-cgroup minimum bandwidth
>> requirements.
> Please note that the only thing we can do is to guarantee minimum
> bandwidth requirement when there is contention for an IO resource, which
> is precisely what a proportional bandwidth scheduler does. Am I missing
> something?

Correct.
Proportional bandwidth scheduling automatically guarantees minimum
requirements (unlike the IO limiting approach, which needs additional
mechanisms to achieve this).

In any case, there is no guarantee that a cgroup/application can sustain,
e.g., 10MB/s on a certain device, but this is a hard problem anyway, and
the best we can do is try to satisfy "soft" constraints.

-Andrea

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel