This series of patches introduces x86 Cache Monitoring Technology (CMT)
to libvirt by interacting with the kernel resource control (resctrl)
interface. CMT is an Intel(R) x86 CPU feature that belongs to Resource
Director Technology (RDT). CMT reports the occupancy of the last level
cache, which is shared by all CPU cores.

We have had several discussions about enabling CMT; please refer to the
following links for the RFCs.
RFCv3
https://www.redhat.com/archives/libvir-list/2018-August/msg01213.html
RFCv2
https://www.redhat.com/archives/libvir-list/2018-July/msg00409.html
https://www.redhat.com/archives/libvir-list/2018-July/msg01241.html
RFCv1
https://www.redhat.com/archives/libvir-list/2018-June/msg00674.html

1. Why is CMT necessary in libvirt?

The perf events 'CMT, MBML, MBMT' have been phased out since Linux
kernel commit c39a0e2c8850f08249383f2425dbd8dbe4baad69, so the perf-based
cmt and mbm support in libvirt does not work with the latest Linux
kernel. These patches add the CMT feature to libvirt through the kernel
resctrlfs interface.

2. Interfaces for CMT from the high level.

2.1 Query the host capability of CMT.

The element 'monitor' represents the host capabilities of CMT.
The CMT attributes involved are:
- 'maxAllocs' denotes the maximum number of monitoring groups that can
  be created, which is limited by the number of hardware 'RMID's.
- 'threshold' denotes the upper bound of cache occupancy, in bytes,
  used to determine whether an RMID can be reused.
- element 'feature' denotes a supported monitoring feature.
- 'llc_occupancy' is the feature that reports last level cache
  occupancy information.

# virsh capabilities
...
    <cache>
      <bank id='0' level='3' type='both' size='15' unit='MiB' cpus='0-5'>
        <control granularity='768' unit='KiB' type='code' maxAllocs='8'/>
        <control granularity='768' unit='KiB' type='data' maxAllocs='8'/>
+       <monitor threshold='540672' unit='B' maxAllocs='176'>
+         <feature name='llc_occupancy'/>
+       </monitor>
      </bank>
      <bank id='1' level='3' type='both' size='15' unit='MiB' cpus='6-11'>
        <control granularity='768' unit='KiB' type='code' maxAllocs='8'/>
        <control granularity='768' unit='KiB' type='data' maxAllocs='8'/>
+       <monitor threshold='540672' unit='B' maxAllocs='176'>
+         <feature name='llc_occupancy'/>
+       </monitor>
      </bank>
    </cache>
...

2.2 Create cache monitoring group (cache monitor).

The main interface for creating a monitoring group is the domain XML
file. The proposed configuration looks like this:

    <cputune>
      <cachetune vcpus='1'>
        <cache id='0' level='3' type='code' size='7680' unit='KiB'/>
        <cache id='1' level='3' type='data' size='3840' unit='KiB'/>
+       <monitor vcpus='1'/>
      </cachetune>
      <cachetune vcpus='4-7'>
+       <monitor vcpus='4-6'/>
      </cachetune>
    </cputune>

The XML above creates two resctrl cache allocation groups and two
resctrl monitoring groups.
Changes to a cache monitor take effect the next time the VM boots.

2.3 Show CMT result through command 'domstats'

The qemu driver gains an interface that reports this information for
each resource monitoring group through the command
'virsh domstats --cpu-total'.
Below is a typical output:

# virsh domstats 1 --cpu-total
Domain: 'ubuntu16.04-base'
...
  cpu.cache.monitor.count=2
  cpu.cache.0.name=vcpus_1
  cpu.cache.0.vcpus=1
  cpu.cache.0.bank.count=2
  cpu.cache.0.bank.0.id=0
  cpu.cache.0.bank.0.bytes=4505600
  cpu.cache.0.bank.1.id=1
  cpu.cache.0.bank.1.bytes=5586944
  cpu.cache.1.name=vcpus_4-6
  cpu.cache.1.vcpus=4,5,6
  cpu.cache.1.bank.count=2
  cpu.cache.1.bank.0.id=0
  cpu.cache.1.bank.0.bytes=17571840
  cpu.cache.1.bank.1.id=1
  cpu.cache.1.bank.1.bytes=29106176

**Changes Since RFCv3**

In the output of 'domstats', added
'cpu.cache.<cmt_group_index>.bank.<bank_index>.id' to report the
OS-assigned cache bank id of each bank. Changes are prefixed with a '+':

# virsh domstats 1 --cpu-total
Domain: 'ubuntu16.04-base'
...
  cpu.cache.monitor.count=2
  cpu.cache.0.name=vcpus_1
  cpu.cache.0.vcpus=1
  cpu.cache.0.bank.count=2
+ cpu.cache.0.bank.0.id=0
  cpu.cache.0.bank.0.bytes=4505600
+ cpu.cache.0.bank.1.id=1
  cpu.cache.0.bank.1.bytes=5586944
  cpu.cache.1.name=vcpus_4-6
  cpu.cache.1.vcpus=4,5,6
  cpu.cache.1.bank.count=2
+ cpu.cache.1.bank.0.id=0
  cpu.cache.1.bank.0.bytes=17571840
+ cpu.cache.1.bank.1.id=1
  cpu.cache.1.bank.1.bytes=29106176

Wang Huaqiang (10):
  conf: Renamed 'controlBuf' to 'childrenBuf'
  util: add interface retrieving CMT capability
  conf: Add CMT capability to host
  test: add test case for resctrl monitor
  util: resctrl: refactoring some functions
  util: Introduce resctrl monitor for CMT
  conf: refactor virDomainResctrlAppend
  conf: introduce resctrl monitor group in domain
  qemu: Introduce resctrl monitoring group
  qemu: Report cache occupancy (CMT) with domstats

 .gnulib                                            |   1 -
 docs/formatdomain.html.in                          |  14 +-
 docs/schemas/capability.rng                        |  28 +
 docs/schemas/domaincommon.rng                      |  11 +-
 src/conf/capabilities.c                            |  51 +-
 src/conf/capabilities.h                            |   1 +
 src/conf/domain_conf.c                             | 159 +++++-
 src/conf/domain_conf.h                             |  20 +
 src/libvirt-domain.c                               |   9 +
 src/libvirt_private.syms                           |   6 +
 src/qemu/qemu_driver.c                             | 265 ++++++++-
 src/qemu/qemu_process.c                            |  40 +-
 src/util/virresctrl.c                              | 597 +++++++++++++++++++--
 src/util/virresctrl.h                              |  48 +-
 tests/genericxml2xmlindata/cachetune-cdp.xml       |   2 +
 .../cachetune-colliding-monitors.xml               |  36 ++
 tests/genericxml2xmlindata/cachetune-small.xml     |   1 +
 tests/genericxml2xmlindata/cachetune.xml           |   3 +
 tests/genericxml2xmltest.c                         |   4 +
 .../resctrl/info/L3_MON/max_threshold_occupancy    |   1 +
 .../linux-resctrl/resctrl/info/L3_MON/mon_features |   3 +
 .../linux-resctrl/resctrl/info/L3_MON/num_rmids    |   1 +
 tests/vircaps2xmldata/vircaps-x86_64-resctrl.xml   |   6 +
 23 files changed, 1208 insertions(+), 99 deletions(-)
 delete mode 160000 .gnulib
 create mode 100644 tests/genericxml2xmlindata/cachetune-colliding-monitors.xml
 create mode 100644 tests/vircaps2xmldata/linux-resctrl/resctrl/info/L3_MON/max_threshold_occupancy
 create mode 100644 tests/vircaps2xmldata/linux-resctrl/resctrl/info/L3_MON/mon_features
 create mode 100644 tests/vircaps2xmldata/linux-resctrl/resctrl/info/L3_MON/num_rmids

-- 
2.7.4

--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list
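
For illustration only, and not part of this series: a monitoring script
could consume the 'cpu.cache.*' fields described in section 2.3 roughly
as sketched below. The sketch assumes the key layout shown in the sample
'domstats' output of this cover letter. The function name
parse_cache_monitors and the shape of the returned dictionary are
hypothetical, and the sample text is copied from the cover letter rather
than read from a live host.

```python
import re
from collections import defaultdict

# Sample copied from the 'virsh domstats 1 --cpu-total' output in
# section 2.3; a real script would read this text from virsh instead.
SAMPLE = """\
cpu.cache.monitor.count=2
cpu.cache.0.name=vcpus_1
cpu.cache.0.vcpus=1
cpu.cache.0.bank.count=2
cpu.cache.0.bank.0.id=0
cpu.cache.0.bank.0.bytes=4505600
cpu.cache.0.bank.1.id=1
cpu.cache.0.bank.1.bytes=5586944
cpu.cache.1.name=vcpus_4-6
cpu.cache.1.vcpus=4,5,6
cpu.cache.1.bank.count=2
cpu.cache.1.bank.0.id=0
cpu.cache.1.bank.0.bytes=17571840
cpu.cache.1.bank.1.id=1
cpu.cache.1.bank.1.bytes=29106176
"""

# Assumed key layout: cpu.cache.<group>.bank.<bank_index>.(id|bytes)
_BANK_RE = re.compile(r"^cpu\.cache\.(\d+)\.bank\.(\d+)\.(id|bytes)$")
# Assumed key layout: cpu.cache.<group>.(name|vcpus)
_FIELD_RE = re.compile(r"^cpu\.cache\.(\d+)\.(name|vcpus)$")


def parse_cache_monitors(text):
    """Hypothetical helper: map each monitor group index to its name,
    vcpus string, and a {bank_id: occupancy_bytes} dictionary."""
    fields = defaultdict(dict)                      # group -> name/vcpus
    banks = defaultdict(lambda: defaultdict(dict))  # group -> bank_index -> id/bytes

    for line in text.splitlines():
        line = line.strip()
        if "=" not in line:
            continue  # skip 'Domain: ...' and '...' lines
        key, value = line.split("=", 1)
        match = _BANK_RE.match(key)
        if match:
            group, index, field = match.groups()
            banks[int(group)][int(index)][field] = int(value)
            continue
        match = _FIELD_RE.match(key)
        if match:
            fields[int(match.group(1))][match.group(2)] = value

    result = {}
    for group in sorted(set(fields) | set(banks)):
        per_bank = {
            entry["id"]: entry["bytes"]
            for _, entry in sorted(banks[group].items())
            if "id" in entry and "bytes" in entry  # ignore incomplete banks
        }
        result[group] = {
            "name": fields[group].get("name"),
            "vcpus": fields[group].get("vcpus"),
            "banks": per_bank,
        }
    return result


if __name__ == "__main__":
    for group, info in parse_cache_monitors(SAMPLE).items():
        total = sum(info["banks"].values())
        print(group, info["name"], info["vcpus"], total)
```

Keying the per-bank result on the reported bank id rather than on the
bank index is a design choice of this sketch: the id is what matches the
'id' attribute of the 'bank' elements in 'virsh capabilities'.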