On Wed, Jul 18, 2018 at 02:29:32AM +0000, Wang, Huaqiang wrote:
-----Original Message-----
From: Martin Kletzander [mailto:mkletzan@xxxxxxxxxx]
Sent: Tuesday, July 17, 2018 5:11 PM
To: Wang, Huaqiang <huaqiang.wang@xxxxxxxxx>
Cc: libvir-list@xxxxxxxxxx; Feng, Shaohe <shaohe.feng@xxxxxxxxx>; Niu, Bing <bing.niu@xxxxxxxxx>; Ding, Jian-feng <jian-feng.ding@xxxxxxxxx>; Zang, Rui <rui.zang@xxxxxxxxx>
Subject: Re: [RFC PATCHv2 00/10] x86 RDT Cache Monitoring Technology (CMT)

On Tue, Jul 17, 2018 at 07:19:41AM +0000, Wang, Huaqiang wrote:
>Hi Martin,
>
>Thanks for your comments. Please see my reply inline.
>
>> -----Original Message-----
>> From: Martin Kletzander [mailto:mkletzan@xxxxxxxxxx]
>> Sent: Tuesday, July 17, 2018 2:27 PM
>> To: Wang, Huaqiang <huaqiang.wang@xxxxxxxxx>
>> Cc: libvir-list@xxxxxxxxxx; Feng, Shaohe <shaohe.feng@xxxxxxxxx>;
>> Niu, Bing <bing.niu@xxxxxxxxx>; Ding, Jian-feng
>> <jian-feng.ding@xxxxxxxxx>; Zang, Rui <rui.zang@xxxxxxxxx>
>> Subject: Re: [RFC PATCHv2 00/10] x86 RDT Cache Monitoring
>> Technology (CMT)
>>
>> On Mon, Jul 09, 2018 at 03:00:48PM +0800, Wang Huaqiang wrote:
>> >
>> >This is the V2 of RFC and the POC source code for introducing x86
>> >RDT CMT feature, thanks Martin Kletzander for his review and
>> >constructive suggestion for V1.
>> >
>> >This series is trying to provide the similar functions of the perf
>> >event based CMT, MBMT and MBML features in reporting cache
>> >occupancy, total memory bandwidth utilization and local memory
>> >bandwidth utilization information in libvirt. Firstly we focus on CMT.
>> >
>> >x86 RDT Cache Monitoring Technology (CMT) provides a method to
>> >track the cache occupancy information per CPU thread. We are
>> >leveraging the implementation of kernel resctrl filesystem and
>> >create our patches on top of that.
>> >
>> >Describing the functionality from a high level:
>> >
>> >1. Extend the output of 'domstats' and report CMT information.
>> >
>> >Comparing with perf event based CMT implementation in libvirt, this
>> >series extends the output of command 'domstats' and reports cache
>> >occupancy information like these:
>> ><pre>
>> >[root@dl-c200 libvirt]# virsh domstats vm3 --cpu-resource
>> >Domain: 'vm3'
>> > cpu.cacheoccupancy.vcpus_2.value=4415488
>> > cpu.cacheoccupancy.vcpus_2.vcpus=2
>> > cpu.cacheoccupancy.vcpus_1.value=7839744
>> > cpu.cacheoccupancy.vcpus_1.vcpus=1
>> > cpu.cacheoccupancy.vcpus_0,3.value=53796864
>> > cpu.cacheoccupancy.vcpus_0,3.vcpus=0,3
>> ></pre>
>> >The vcpus have been arranged into three monitoring groups, these
>> >three groups cover vcpu 1, vcpu 2 and vcpus 0,3 respectively. Take
>> >an example, the 'cpu.cacheoccupancy.vcpus_0,3.value' reports the
>> >cache occupancy information for vcpu 0 and vcpu 3, the
>> >'cpu.cacheoccupancy.vcpus_0,3.vcpus'
>> >represents the vcpu group information.
>> >
>> >To address Martin's suggestion "beware as 1-4 is something else than
>> >1,4 so you need to differentiate that.", the content of 'vcpus'
>> >(cpu.cacheoccupancy.<groupname>.vcpus=xxx) has been specially
>> >processed, if vcpus is a continuous range, e.g. 0-2, then the output
>> >of cpu.cacheoccupancy.vcpus_0-2.vcpus will be like
>> >'cpu.cacheoccupancy.vcpus_0-2.vcpus=0,1,2'
>> >instead of
>> >'cpu.cacheoccupancy.vcpus_0-2.vcpus=0-2'.
>> >Please note that 'vcpus_0-2' is a name of this monitoring group,
>> >could be specified any other word from the XML configuration file or
>> >lively changed with the command introduced in following part.
>> >
>>
>> One small nit according to the naming (but it shouldn't block any
>> reviewers from reviewing, just keep this in mind for next version for
>> example) is that this is still inconsistent.
>
>OK. I'll try to use words such as 'cache', 'cpu resource' and avoid
>using 'RDT', 'CMT'.
>

Oh, you misunderstood, I meant the naming in the domstats output =)

>> The way domstats are structured when there is something like an
>> array could shed some light into this. What you suggested is really
>> kind of hard to parse (although looks better). What would you say to
>> something like this:
>>
>> cpu.cacheoccupancy.count=3
>> cpu.cacheoccupancy.0.value=4415488
>> cpu.cacheoccupancy.0.vcpus=2
>> cpu.cacheoccupancy.0.name=vcpus_2
>> cpu.cacheoccupancy.1.value=7839744
>> cpu.cacheoccupancy.1.vcpus=1
>> cpu.cacheoccupancy.1.name=vcpus_1
>> cpu.cacheoccupancy.2.value=53796864
>> cpu.cacheoccupancy.2.vcpus=0,3
>> cpu.cacheoccupancy.2.name=0,3
>>
>
>Your arrangement looks more reasonable, thanks for your advice.
>However, as I mentioned in another email that I sent to libvirt-list
>hours ago, the kernel resctrl interface provides cache occupancy
>information for each cache block for every resource group.
>Maybe we need to expose the cache occupancy for each cache block.
>If you agree, we need to refine the 'domstats' output message, how
>about this:
>
> cpu.cacheoccupancy.count=3
> cpu.cacheoccupancy.0.name=vcpus_2
> cpu.cacheoccupancy.0.vcpus=2
> cpu.cacheoccupancy.0.block.count=2
> cpu.cacheoccupancy.0.block.0.bytes=5488
> cpu.cacheoccupancy.0.block.1.bytes=4410000
> cpu.cacheoccupancy.1.name=vcpus_1
> cpu.cacheoccupancy.1.vcpus=1
> cpu.cacheoccupancy.1.block.count=2
> cpu.cacheoccupancy.1.block.0.bytes=7839744
> cpu.cacheoccupancy.1.block.1.bytes=0
> cpu.cacheoccupancy.2.name=0,3
> cpu.cacheoccupancy.2.vcpus=0,3
> cpu.cacheoccupancy.2.block.count=2
> cpu.cacheoccupancy.2.block.0.bytes=53796864
> cpu.cacheoccupancy.2.block.1.bytes=0
>

What do you mean by cache block? Is that (cache_size / granularity)? In that case it looks fine, I guess (without putting too much thought into it).

No. The 'cache block' I mean is indexed by 'cache id', with the id number kept in '/sys/devices/system/cpu/cpu*/cache/index*/id'.
Generally, a two-socket server node (with E5-2680 v4 CPUs, for example) has one L3 cache per socket. If a resctrl monitoring group is created (/sys/fs/resctrl/p0, for example), you can find the cache occupancy information for these two L3 caches separately in the files
/sys/fs/resctrl/p0/mon_data/mon_L3_00/llc_occupancy
and
/sys/fs/resctrl/p0/mon_data/mon_L3_01/llc_occupancy

Cache information for an individual socket is useful for detecting performance issues such as workload imbalance. We'd better expose these details to libvirt users.

To be clear, I am using 'cache block' to describe the CPU cache indexed by the number found in '/sys/devices/system/cpu/cpu*/cache/index*/id'. I welcome suggestions for other names for it.
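
For illustration only, here is a minimal Python sketch of how the per-cache occupancy values described above could be collected from one resctrl monitoring group. It is not code from the patch series. The function name read_llc_occupancy is hypothetical, and the sketch assumes every llc_occupancy file holds a plain decimal byte count.

```python
import re
from pathlib import Path

# Directory names under mon_data look like "mon_L3_00", "mon_L3_01", ...
# The trailing number is the cache id of the L3 cache being monitored.
_MON_L3_DIR = re.compile(r"^mon_L3_(\d+)$")


def read_llc_occupancy(group_dir):
    """Return {cache_id: occupancy_in_bytes} for one monitoring group.

    group_dir is the group's directory, e.g. /sys/fs/resctrl/p0.
    (Hypothetical helper, not part of libvirt or the kernel.)
    """
    occupancy = {}
    mon_data = Path(group_dir) / "mon_data"
    for entry in sorted(mon_data.iterdir()):
        match = _MON_L3_DIR.match(entry.name)
        if match is None:
            continue  # skip anything that is not an L3 monitoring directory
        cache_id = int(match.group(1))
        # Assumption: the file contains a single decimal number; int()
        # raises ValueError if the kernel reports something else.
        text = (entry / "llc_occupancy").read_text().strip()
        occupancy[cache_id] = int(text)
    return occupancy
```

On the two-socket machine from the example, read_llc_occupancy("/sys/fs/resctrl/p0") would return one entry for cache id 0 and one for cache id 1, which is the per-cache detail proposed for the domstats output.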
To be consistent, I'd prefer "cache", "cache bank" and "index" or "id". I don't have specific requirements; I just don't want to invent new words. Look at how it is described in capabilities, for example.
Martin