* Daniel P. Berrange <berrange@xxxxxxxxxx> [2010-08-24 11:02:44]:

> On Tue, Aug 24, 2010 at 01:05:26PM +0530, Balbir Singh wrote:
> > * Nikunj A. Dadhania <nikunj@xxxxxxxxxxxxxxxxxx> [2010-08-24 11:53:27]:
> >
> > >
> > > Subject: [RFC] Memory controller exploitation in libvirt
> > >
> > > Memory CGroup is a kernel feature that can be exploited effectively in the
> > > current libvirt/qemu driver. Here is a shot at that.
> > >
> > > At present, QEMU uses the memory ballooning feature, where memory can be
> > > inflated/deflated as and when needed, co-operatively between the host and
> > > the guest. There should be some mechanism where the host can have more
> > > control over the guest's memory usage. Memory CGroup provides features such
> > > as hard-limit and soft-limit for memory, and hard-limit for swap area.
> > >
> > > Design 1: Provide new API and XML changes for resource management
> > > =================================================================
> > >
> > > Not all of the memory controller tunables are supported by the current
> > > abstractions provided by the libvirt API. libvirt works on various OSes.
> > > This new API will support GNU/Linux initially, and as and when other
> > > platforms start supporting memory tunables, the interface could be enabled
> > > for them. This adds the following two function pointers to the virDriver
> > > interface.
> > >
> > > 1) domainSetMemoryParameters: which would take one or more name-value
> > >    pairs. This makes the API extensible, and agnostic to the kind of
> > >    parameters supported by various hypervisors.
> > > 2) domainGetMemoryParameters: For getting current memory parameters
> > >
> > > Corresponding libvirt public API:
> > > int virDomainSetMemoryParameters (virDomainPtr domain,
> > >                                   virMemoryParameterPtr params,
> > >                                   unsigned int nparams);
> > > int virDomainGetMemoryParameters (virDomainPtr domain,
> > >                                   virMemoryParameterPtr params,
> > >                                   unsigned int nparams);
> > >
> > >
> >
> > Does nparams imply setting several parameters together? Does bulk
> > loading help? I would prefer splitting out the API if possible
> > into
> >
> >     virCgroupSetMemory() - already present in src/util/cgroup.c
> >     virCgroupGetMemory() - already present in src/util/cgroup.c
> >     virCgroupSetMemorySoftLimit()
> >     virCgroupSetMemoryHardLimit()
> >     virCgroupSetMemorySwapHardLimit()
> >     virCgroupGetStats()
>
> Nope, we don't want cgroups exposed in the public API, since this
> has to be applicable to the VMware and OpenVZ drivers too.
>

I am not talking about exposing these as public API, but about making
them part of src/util/cgroup.c, to be used by the qemu driver. It is
good to abstract out the OS-independent parts, but my concern was double
exposure through an API like driver->setMemory() that is currently used
and the newer API.

> > > Parameter list supported:
> > >
> > > MemoryHardLimits (memory.limit_in_bytes) - Maximum memory
> > > MemorySoftLimits (memory.soft_limit_in_bytes) - Desired memory
> >
> > Soft limits allow you to set a memory limit that applies under contention.
> >
> > > MemoryMinimumGuarantee - Minimum memory required (without this amount of
> > > memory, VM should not be started)
> > >
> > > SwapHardLimits (memory.memsw.limit_in_bytes) - Maximum swap
> > > SwapSoftLimits (Currently not supported by kernel) - Desired swap space
> > >
> >
> > We *don't* support SwapSoftLimits in the memory cgroup controller, with
> > no plans to support it in the future either at this point. The
> > semantics are just too hard to get right at the moment.
>
> That's not a huge problem.
> Since we have many hypervisors to support
> in libvirt, I expect the set of tunables will expand over time, and
> not every hypervisor driver in libvirt will support every tunable.
> They'll just pick the tunables that apply to them. We can leave
> SwapSoftLimits out of the public API until we find a HV that needs
> it.
>
> > > Tunables memory.limit_in_bytes, memory.soft_limit_in_bytes and
> > > memory.memsw.limit_in_bytes are provided by the memory controller in the
> > > Linux kernel.
> > >
> > > I am not an expert here, so just listing what new elements need to be added
> > > to the XML schema:
> > >
> > > <define name="resource">
> > >   <element memory>
> > >     <element memoryHardLimit/>
> > >     <element memorySoftLimit/>
> > >     <element memoryMinGuarantee/>
> > >     <element swapHardLimit/>
> > >     <element swapSoftLimit/>
> > >   </element>
> > > </define>
> > >
> >
> > I'd prefer a syntax that integrates well with what we currently have
> >
> > <cgroup>
> >   <path>...</path>
> >   <controller>
> >     <name>..</name>
> >     <soft limit>...</>
> >     <hard limit>...</>
> >   </controller>
> >   ...
> > </cgroup>
>
> That is exposing far too much info about the cgroups implementation
> details. The XML representation needs to be decoupled from the
> implementation.
>

Don't we already expose a lot of information about qemu, for example
vhost-net, the command line, virtio, etc., in the qemu configuration of
a guest? I am not opposed to having a higher-level abstraction, but I am
concerned that some of the nitty-gritty details like swappiness (yes,
that is a tunable) or the interpretation of stats might vary widely
across operating systems. Hence, I felt it is better to expose it as a
part of the qemu-cgroup-linux driver combo.

-- 
	Three Cheers,
	Balbir

--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list
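
For readers following the thread, here is a rough sketch of the split being
argued over: a hypervisor-neutral list of name-value pairs on the public
side, and a private lookup table inside a Linux driver that maps each name
to a memory cgroup file. Everything in it is an assumption made for
illustration. The MemoryParameter and TunableMap types, the parameter names
("hard_limit", "soft_limit", "swap_hard_limit") and both functions are
hypothetical and are not libvirt's API. Only the three cgroup v1 file names
are real kernel interfaces.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical name-value pair, standing in for the proposed parameter type. */
typedef struct {
    const char *name;          /* abstract, hypervisor-neutral tunable name */
    unsigned long long value;  /* limit in bytes */
} MemoryParameter;

/* Hypothetical driver-private mapping; callers of the public API never see it. */
typedef struct {
    const char *name;
    const char *cgroup_file;
} TunableMap;

/* cgroup v1 memory controller files; the left-hand names are made up here. */
static const TunableMap linux_tunables[] = {
    { "hard_limit",      "memory.limit_in_bytes" },
    { "soft_limit",      "memory.soft_limit_in_bytes" },
    { "swap_hard_limit", "memory.memsw.limit_in_bytes" },
};

/* Return the cgroup file backing a tunable, or NULL if this driver lacks it. */
const char *lookup_cgroup_file(const char *name)
{
    size_t i;

    for (i = 0; i < sizeof(linux_tunables) / sizeof(linux_tunables[0]); i++) {
        if (strcmp(linux_tunables[i].name, name) == 0)
            return linux_tunables[i].cgroup_file;
    }
    return NULL;
}

/*
 * Resolve a batch of parameters to cgroup files without writing anything.
 * Returns 0 and fills files_out[0..nparams-1] on success, or -1 as soon as
 * an unsupported name is found, so a caller can reject the whole batch
 * before applying any of it.
 */
int resolve_memory_parameters(const MemoryParameter *params,
                              unsigned int nparams,
                              const char **files_out)
{
    unsigned int i;

    for (i = 0; i < nparams; i++) {
        files_out[i] = lookup_cgroup_file(params[i].name);
        if (files_out[i] == NULL)
            return -1;
    }
    return 0;
}
```

A driver for another hypervisor would carry its own table, or none at all,
which is one way to read the point that each driver picks only the tunables
that apply to it. A name such as "swap_soft_limit" fails the lookup here
because no kernel file backs it.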