From: Bing Niu <bing.niu@xxxxxxxxx>

This series introduces RDT memory bandwidth allocation support by
extending the current virresctrl implementation.

The Memory Bandwidth Allocation (MBA) feature provides indirect and
approximate control over the memory bandwidth available per core. It
offers a way to control applications that may be over-utilizing
bandwidth relative to their priority in environments such as the
data center. The details can be found in Intel's SDM 17.19.7. The
kernel supports MBA through the resctrl file system, the same as CAT.
Each resctrl group has an MB parameter that controls how much memory
bandwidth it can use, expressed as a percentage.

In this series, MBA is enabled by enhancing the existing virresctrl
implementation. The policy employed for MBA is similar to that of CAT:
the sum of the bandwidth of all MBA groups does not exceed 100%. The
enhancement of virresctrl has two main parts:

Part 1: Add two new structures, virResctrlInfoMemMB and
virResctrlAllocMemBW, for collecting the host system's MBA capability
and the domain's memory bandwidth allocation. These two structures
extend the existing virResctrlInfo and virResctrlAlloc. With them, the
virresctrl framework can support MBA and CAT concurrently. Each
virResctrlAlloc represents a resource allocation that includes CAT,
MBA, or both. The policy for MBA is that the total memory bandwidth of
all resctrl groups created by virresctrl does not exceed 100%.

Part 2: On the XML side, add new elements to the host capabilities
query and to the domain allocation to support memory bandwidth
allocation.

---------------------------------------------------------------------

For the host capabilities XML, the new format looks like the example
below:

<host>
  .....
  <memory_bandwidth>
    <node id='0' cpus='0-19'>
      <control granularity='10' min='10' maxAllocs='8'/>
    </node>
  </memory_bandwidth>
</host>

granularity --- memory bandwidth granularity
min         --- minimum memory bandwidth allowed
maxAllocs   --- maximum number of concurrent memory bandwidth
                allocations allowed

---------------------------------------------------------------------

For the domain XML, the new format looks like the example below:

<domain type='kvm' id='2'>
  ......
  <cputune>
    ......
    <shares>1024</shares>
    <memorytune vcpus='0-1'>
      <node id='0' bandwidth='20'/>
    </memorytune>
  </cputune>
  ......
</domain>

id        --- node where the memory bandwidth allocation will happen
bandwidth --- bandwidth allocated, in percent

---------------------------------------------------------------------

With this extension of virresctrl, the overall workflow of CAT and MBA
is shown in the picture below. The XML parser aggregates the MBA and
CAT configuration and represents it in one virresctrl object. The
methods of the virresctrl class manipulate the resctrl interface to
allocate the corresponding resources.

  <memorytune vcpus='0-3'>      <cachetune vcpus='0-3'>
             |                             |
             +--------------+--------------+
                            |
                            |  XML parser
                            |
                +-----------v-----------+
                |    virResctrlAlloc    |   internal object
                +-----------+-----------+
                            |
  - - - - - - - - - - - - - | - - - - - - - - - - - - - - - - - - -
                            |
                +-----------v-----------+
                |  schemata             |
                |  tasks                |   backing object
                |    .                  |   /sys/fs/resctrl
                |    .                  |
                +-----------------------+

---------------------------------------------------------------------

Previous versions and discussion can be found at:

v1:     https://www.redhat.com/archives/libvir-list/2018-July/msg01144.html
RFC v2: https://www.redhat.com/archives/libvir-list/2018-June/msg01268.html
RFC v1: https://www.redhat.com/archives/libvir-list/2018-May/msg02101.html

Changelog:

v1 -> this:
  John's comments:
    1. Split the calculation of the number of memory bandwidth
       controls into its own patch.
    2. Split the virResctrlAllocMemBW-related methods into 5 patches,
       each providing one kind of function, e.g. schemata processing,
       memory bandwidth calculation.
    3. Use resctrl to replace cachetune in the domain conf.
    4. Split the refactoring of virDomainCachetuneDefParse into 3
       patches, and adjust some logic, e.g. use the %s format for
       error logs, rename functions.
    5. Complete the doc description, e.g. update the cachetune part
       about vcpus overlapping with memorytune, and update the libvirt
       version info for memory bandwidth control availability.
    6. Some coding style fixes.

RFC_v2 -> v1:
  John's comments:
    1. Use the name MemBW instead of MB for a clearer description.
    2. Split the rename patch and put the function refactoring
       separately.
    3. Split virResctrlInfoMemMB and virResctrlAllocMemBW into
       different patches.
    4. Add docs/schemas/*.rng for the XML-related patches.
    5. Some cleanup for coding conventions.

RFC_v1 -> RFC_v2:
  Jano's comments:
    1. Put the renaming parts into separate patches.
    2. Set the initial return value to -1.
    3. Use full names in structure definitions.
    4. Do not use VIR_CACHE_TYPE_LAST for memory bandwidth allocation
       formatting.
  Pavel's comments:
    1. Add host capabilities XML for memory bandwidth allocation.
    2. Do not reuse the cachetune section of the domain XML for memory
       bandwidth allocation; define a dedicated section for it.
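
As a rough illustration of the allocation policy described above
(per-node bandwidth in percent, bounded by the host's granularity, min
and maxAllocs values, with the per-node total capped at 100%), the
check could be sketched as follows. This is not code from the series:
struct mba_control, mba_request_fits() and mba_format_schemata() are
hypothetical names, and treating maxAllocs as a plain upper bound on
the number of groups is an assumption. The "MB:<id>=<percent>" line
follows the kernel's resctrl schemata format.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical mirror of <control granularity='10' min='10' maxAllocs='8'/>. */
struct mba_control {
    unsigned int granularity;  /* step size of an allocation, in percent */
    unsigned int min;          /* smallest allocation allowed, in percent */
    unsigned int max_allocs;   /* maximum number of concurrent allocations */
};

/*
 * Return true if a new request of `want` percent on one node fits,
 * given the percentages already handed out to other groups on that
 * node. Assumption: the per-node sum over all groups must stay <= 100.
 */
bool mba_request_fits(const struct mba_control *host,
                      const unsigned int *used, size_t nused,
                      unsigned int want)
{
    unsigned int sum = 0;
    size_t i;

    assert(host != NULL);

    /* The request itself must respect min, granularity and the 100% cap. */
    if (host->granularity == 0 || want < host->min || want > 100 ||
        want % host->granularity != 0)
        return false;

    /* No free allocation slot left on this host. */
    if (nused >= host->max_allocs)
        return false;

    /* Add up existing allocations, bailing out if they already exceed 100. */
    for (i = 0; i < nused; i++) {
        if (used[i] > 100 - sum)
            return false;
        sum += used[i];
    }

    return want <= 100 - sum;
}

/*
 * Format a schemata line for a single node, e.g. "MB:0=20\n".
 * Returns whatever snprintf() returns.
 */
int mba_format_schemata(char *buf, size_t len,
                        unsigned int node_id, unsigned int bandwidth)
{
    return snprintf(buf, len, "MB:%u=%u\n", node_id, bandwidth);
}
```

With the host values from the capabilities example (granularity 10,
min 10, maxAllocs 8) and existing groups at 20% and 50% on node 0, a
request for 30% fits, while 40% (the sum would exceed 100%) and 25%
(not a multiple of the granularity) are rejected.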

Bing Niu (17):
  util: Rename some functions of virresctrl
  util: Refactor virResctrlGetInfo in virresctrl
  util: Refactor virResctrlAllocFormat of virresctrl
  util: Add MBA capability information query to resctrl
  util: Add MBA check to virResctrlInfoGetCache
  util: Add MBA allocation to virresctrl
  util: Add MBA schemata parse and format methods
  util: Add support to calculate MBA utilization
  util: Introduce virResctrlAllocForeachMemory
  util: Introduce virResctrlAllocSetMemoryBandwidth
  conf: Rename cachetune to resctrl
  conf: Factor out vcpus parsing part from virDomainCachetuneDefParse
  conf: Factor out vcpus overlapping from virDomainCachetuneDefParse
  conf: Factor out virDomainResctrlDef update from virDomainCachetuneDefParse
  conf: Add support for memorytune XML processing for resctrl MBA
  conf: Add return value check to virResctrlAllocForeachCache
  conf: Add memory bandwidth allocation capability of host

 docs/formatdomain.html.in                        |  39 +-
 docs/schemas/capability.rng                      |  33 ++
 docs/schemas/domaincommon.rng                    |  17 +
 src/conf/capabilities.c                          | 107 ++++
 src/conf/capabilities.h                          |  11 +
 src/conf/domain_conf.c                           | 427 +++++++++++---
 src/conf/domain_conf.h                           |  10 +-
 src/libvirt_private.syms                         |   6 +-
 src/qemu/qemu_domain.c                           |   2 +-
 src/qemu/qemu_process.c                          |  18 +-
 src/util/virresctrl.c                            | 611 +++++++++++++++++++--
 src/util/virresctrl.h                            |  55 +-
 .../memorytune-colliding-allocs.xml              |  30 +
 .../memorytune-colliding-cachetune.xml           |  32 ++
 tests/genericxml2xmlindata/memorytune.xml        |  33 ++
 tests/genericxml2xmltest.c                       |   5 +
 .../linux-resctrl/resctrl/info/MB/bandwidth_gran |   1 +
 .../linux-resctrl/resctrl/info/MB/min_bandwidth  |   1 +
 .../linux-resctrl/resctrl/info/MB/num_closids    |   1 +
 tests/vircaps2xmldata/vircaps-x86_64-resctrl.xml |   8 +
 tests/virresctrldata/resctrl.schemata            |   1 +
 21 files changed, 1279 insertions(+), 169 deletions(-)
 create mode 100644 tests/genericxml2xmlindata/memorytune-colliding-allocs.xml
 create mode 100644 tests/genericxml2xmlindata/memorytune-colliding-cachetune.xml
 create mode 100644 tests/genericxml2xmlindata/memorytune.xml
 create mode 100644 tests/vircaps2xmldata/linux-resctrl/resctrl/info/MB/bandwidth_gran
 create mode 100644 tests/vircaps2xmldata/linux-resctrl/resctrl/info/MB/min_bandwidth
 create mode 100644 tests/vircaps2xmldata/linux-resctrl/resctrl/info/MB/num_closids

-- 
2.7.4