Re: [RFC PATCHv2 00/10] x86 RDT Cache Monitoring Technology (CMT)

On Wed, Jul 18, 2018 at 12:19:18PM +0000, Wang, Huaqiang wrote:


-----Original Message-----
From: Martin Kletzander [mailto:mkletzan@xxxxxxxxxx]
Sent: Wednesday, July 18, 2018 8:07 PM
To: Wang, Huaqiang <huaqiang.wang@xxxxxxxxx>
Cc: libvir-list@xxxxxxxxxx; Feng, Shaohe <shaohe.feng@xxxxxxxxx>; Niu, Bing
<bing.niu@xxxxxxxxx>; Ding, Jian-feng <jian-feng.ding@xxxxxxxxx>; Zang, Rui
<rui.zang@xxxxxxxxx>
Subject: Re:  [RFC PATCHv2 00/10] x86 RDT Cache Monitoring
Technology (CMT)

On Wed, Jul 18, 2018 at 02:29:32AM +0000, Wang, Huaqiang wrote:
>
>
>> -----Original Message-----
>> From: Martin Kletzander [mailto:mkletzan@xxxxxxxxxx]
>> Sent: Tuesday, July 17, 2018 5:11 PM
>> To: Wang, Huaqiang <huaqiang.wang@xxxxxxxxx>
>> Cc: libvir-list@xxxxxxxxxx; Feng, Shaohe <shaohe.feng@xxxxxxxxx>;
>> Niu, Bing <bing.niu@xxxxxxxxx>; Ding, Jian-feng
>> <jian-feng.ding@xxxxxxxxx>; Zang, Rui <rui.zang@xxxxxxxxx>
>> Subject: Re:  [RFC PATCHv2 00/10] x86 RDT Cache Monitoring
>> Technology (CMT)
>>
>> On Tue, Jul 17, 2018 at 07:19:41AM +0000, Wang, Huaqiang wrote:
>> >Hi Martin,
>> >
>> >Thanks for your comments. Please see my reply inline.
>> >
>> >> -----Original Message-----
>> >> From: Martin Kletzander [mailto:mkletzan@xxxxxxxxxx]
>> >> Sent: Tuesday, July 17, 2018 2:27 PM
>> >> To: Wang, Huaqiang <huaqiang.wang@xxxxxxxxx>
>> >> Cc: libvir-list@xxxxxxxxxx; Feng, Shaohe <shaohe.feng@xxxxxxxxx>;
>> >> Niu, Bing <bing.niu@xxxxxxxxx>; Ding, Jian-feng
>> >> <jian-feng.ding@xxxxxxxxx>; Zang, Rui <rui.zang@xxxxxxxxx>
>> >> Subject: Re:  [RFC PATCHv2 00/10] x86 RDT Cache
>> >> Monitoring Technology (CMT)
>> >>
>> >> On Mon, Jul 09, 2018 at 03:00:48PM +0800, Wang Huaqiang wrote:
>> >> >
>> >> >This is the V2 of RFC and the POC source code for introducing x86
>> >> >RDT CMT feature, thanks Martin Kletzander for his review and
>> >> >constructive suggestion for V1.
>> >> >
>> >> >This series is trying to provide functionality similar to the
>> >> >perf-event-based CMT, MBMT and MBML features, reporting cache
>> >> >occupancy, total memory bandwidth utilization and local memory
>> >> >bandwidth utilization information in libvirt. Firstly we focus on CMT.
>> >> >
>> >> >x86 RDT Cache Monitoring Technology (CMT) provides a method to
>> >> >track the cache occupancy information per CPU thread. We are
>> >> >leveraging the implementation of kernel resctrl filesystem and
>> >> >create our patches on top of that.
>> >> >
>> >> >Describing the functionality from a high level:
>> >> >
>> >> >1. Extend the output of 'domstats' and report CMT information.
>> >> >
>> >> >Compared with the perf-event-based CMT implementation in libvirt,
>> >> >this series extends the output of the 'domstats' command and reports
>> >> >cache occupancy information like this:
>> >> ><pre>
>> >> >[root@dl-c200 libvirt]# virsh domstats vm3 --cpu-resource
>> >> >Domain: 'vm3'
>> >> >  cpu.cacheoccupancy.vcpus_2.value=4415488
>> >> >  cpu.cacheoccupancy.vcpus_2.vcpus=2
>> >> >  cpu.cacheoccupancy.vcpus_1.value=7839744
>> >> >  cpu.cacheoccupancy.vcpus_1.vcpus=1
>> >> >  cpu.cacheoccupancy.vcpus_0,3.value=53796864
>> >> >  cpu.cacheoccupancy.vcpus_0,3.vcpus=0,3
>> >> ></pre>
>> >> >The vcpus have been arranged into three monitoring groups; these
>> >> >three groups cover vcpu 1, vcpu 2 and vcpus 0,3 respectively.
>> >> >For example, 'cpu.cacheoccupancy.vcpus_0,3.value' reports
>> >> >the cache occupancy information for vcpu 0 and vcpu 3, and
>> >> >'cpu.cacheoccupancy.vcpus_0,3.vcpus'
>> >> >represents the vcpu group information.
>> >> >
>> >> >To address Martin's suggestion "beware as 1-4 is something else
>> >> >than 1,4 so you need to differentiate that.", the content of 'vcpus'
>> >> >(cpu.cacheoccupancy.<groupname>.vcpus=xxx) is specially
>> >> >processed: if vcpus is a continuous range, e.g. 0-2, then the
>> >> >output of cpu.cacheoccupancy.vcpus_0-2.vcpus will be
>> >> >'cpu.cacheoccupancy.vcpus_0-2.vcpus=0,1,2'
>> >> >instead of
>> >> >'cpu.cacheoccupancy.vcpus_0-2.vcpus=0-2'.
>> >> >Please note that 'vcpus_0-2' is the name of this monitoring group;
>> >> >it could be set to any other word in the XML configuration file
>> >> >or changed live with the command introduced in a following part.
>> >> >
>> >>
>> >> One small nit about the naming (it shouldn't block any
>> >> reviewers from reviewing, just keep this in mind for the next
>> >> version) is that this is still inconsistent.
>> >
>> >OK.  I'll try to use words such as 'cache', 'cpu resource' and avoid
>> >using 'RDT', 'CMT'.
>> >
>>
>> Oh, you misunderstood, I meant the naming in the domstats output =)
>>
>> >> The way domstats are structured when there is something like an
>> >> array could shed some light into this.  What you suggested is
>> >> really kind of hard to parse (although it looks better).  What would
>> >> you say to something like this:
>> >>
>> >>   cpu.cacheoccupancy.count = 3
>> >>   cpu.cacheoccupancy.0.value=4415488
>> >>   cpu.cacheoccupancy.0.vcpus=2
>> >>   cpu.cacheoccupancy.0.name=vcpus_2
>> >>   cpu.cacheoccupancy.1.value=7839744
>> >>   cpu.cacheoccupancy.1.vcpus=1
>> >>   cpu.cacheoccupancy.1.name=vcpus_1
>> >>   cpu.cacheoccupancy.2.value=53796864
>> >>   cpu.cacheoccupancy.2.vcpus=0,3
>> >>   cpu.cacheoccupancy.2.name=0,3
>> >>
>> >
>> >Your arrangement looks more reasonable, thanks for your advice.
>> >However, as I mentioned in another email I sent to libvirt-list a few
>> >hours ago, the kernel resctrl interface provides cache occupancy
>> >information for each cache block in every resource group.
>> >Maybe we need to expose the cache occupancy for each cache block.
>> >If you agree, we need to refine the 'domstats' output; how
>> >about this:
>> >
>> >  cpu.cacheoccupancy.count=3
>> >  cpu.cacheoccupancy.0.name=vcpus_2
>> >  cpu.cacheoccupancy.0.vcpus=2
>> >  cpu.cacheoccupancy.0.block.count=2
>> >  cpu.cacheoccupancy.0.block.0.bytes=5488
>> >  cpu.cacheoccupancy.0.block.1.bytes=4410000
>> >  cpu.cacheoccupancy.1.name=vcpus_1
>> >  cpu.cacheoccupancy.1.vcpus=1
>> >  cpu.cacheoccupancy.1.block.count=2
>> >  cpu.cacheoccupancy.1.block.0.bytes=7839744
>> >  cpu.cacheoccupancy.1.block.1.bytes=0
>> >  cpu.cacheoccupancy.2.name=0,3
>> >  cpu.cacheoccupancy.2.vcpus=0,3
>> >  cpu.cacheoccupancy.2.block.count=2
>> >  cpu.cacheoccupancy.2.block.0.bytes=53796864
>> >  cpu.cacheoccupancy.2.block.1.bytes=0
>> >
>>
>> What do you mean by cache block?  Is that (cache_size / granularity)?
>> In that case it looks fine, I guess (without putting too much thought into it).
>
>No. 'cache block' that I mean is indexed with 'cache id', with the id
>number kept in '/sys/devices/system/cpu/cpu*/cache/index*/id'.
>
>Generally, on a two-socket server node (with CPU E5-2680 v4, for
>example), each socket has an L3 cache. If a resctrl monitoring
>group is created (/sys/fs/resctrl/p0, for example),
>you can find the cache occupancy information for these two L3 cache
>areas separately in the file
>/sys/fs/resctrl/p0/mon_data/mon_L3_00/llc_occupancy
>and the file
>/sys/fs/resctrl/p0/mon_data/mon_L3_01/llc_occupancy
>Cache information for an individual socket is meaningful for detecting
>performance issues such as workload imbalance, etc. We'd better expose
>these details to libvirt users.
>In short, I am using 'cache block' to describe the CPU cache
>indexed with the number found in '/sys/devices/system/cpu/cpu*/cache/index*/id'.
>I welcome suggestions on other names for it.
>

To be consistent I'd prefer "cache", "cache bank", and "index" or "id".  I don't
have specific requirements, I just don't want to invent new words.  Look at how
it is described in the capabilities, for example.
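As a quick sketch of the per-cache-id reads from resctrl described above (the '/sys/fs/resctrl/p0' group path is just an example; any monitoring group directory works the same way):

```python
import os

def read_llc_occupancy(group="/sys/fs/resctrl/p0"):
    """Read per-cache-bank LLC occupancy (in bytes) for one resctrl
    monitoring group.  Returns a dict mapping cache id -> bytes."""
    occupancy = {}
    mon_data = os.path.join(group, "mon_data")
    for entry in sorted(os.listdir(mon_data)):
        # Subdirectories are named mon_L3_00, mon_L3_01, ... where the
        # numeric suffix is the cache id also visible in
        # /sys/devices/system/cpu/cpu*/cache/index*/id
        if not entry.startswith("mon_L3_"):
            continue
        cache_id = int(entry[len("mon_L3_"):])
        with open(os.path.join(mon_data, entry, "llc_occupancy")) as f:
            occupancy[cache_id] = int(f.read().strip())
    return occupancy
```

On the two-socket example above this would return two entries, one per L3 cache id.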

Makes sense. Then let's use 'id' for the purpose, and the output would be:

cpu.cacheoccupancy.count=3
cpu.cacheoccupancy.0.name=vcpus_2
cpu.cacheoccupancy.0.vcpus=2
cpu.cacheoccupancy.0.id.count=2
cpu.cacheoccupancy.0.id.0.bytes=5488
cpu.cacheoccupancy.0.id.1.bytes=4410000
cpu.cacheoccupancy.1.name=vcpus_1
cpu.cacheoccupancy.1.vcpus=1
cpu.cacheoccupancy.1.id.count=2
cpu.cacheoccupancy.1.id.0.bytes=7839744
cpu.cacheoccupancy.1.id.1.bytes=0
cpu.cacheoccupancy.2.name=0,3
cpu.cacheoccupancy.2.vcpus=0,3
cpu.cacheoccupancy.2.id.count=2
cpu.cacheoccupancy.2.id.0.bytes=53796864
cpu.cacheoccupancy.2.id.1.bytes=0

How about it?


I'm switching contexts too much and hence I didn't make myself clear.  Since IDs
are not guaranteed to be consecutive, this might be more future-proof:

cpu.cacheoccupancy.count=3
cpu.cacheoccupancy.0.name=vcpus_2
cpu.cacheoccupancy.0.vcpus=2
cpu.cacheoccupancy.0.bank.count=2
cpu.cacheoccupancy.0.bank.0.id=0
cpu.cacheoccupancy.0.bank.0.bytes=5488
cpu.cacheoccupancy.0.bank.1.id=1
cpu.cacheoccupancy.0.bank.1.bytes=4410000
cpu.cacheoccupancy.1.name=vcpus_1
cpu.cacheoccupancy.1.vcpus=1
cpu.cacheoccupancy.1.bank.count=2
cpu.cacheoccupancy.1.bank.0.id=0
cpu.cacheoccupancy.1.bank.0.bytes=7839744
cpu.cacheoccupancy.1.bank.1.id=1
cpu.cacheoccupancy.1.bank.1.bytes=0
cpu.cacheoccupancy.2.name=0,3
cpu.cacheoccupancy.2.vcpus=0,3
cpu.cacheoccupancy.2.bank.count=2
cpu.cacheoccupancy.2.bank.0.id=0
cpu.cacheoccupancy.2.bank.0.bytes=53796864
cpu.cacheoccupancy.2.bank.1.id=1
cpu.cacheoccupancy.2.bank.1.bytes=0
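For what it's worth, a consumer could mechanically parse this flat output back into per-group records, which is what makes the count-based layout easy to handle; a minimal sketch (the helper name is made up, field names are taken from the proposal above):

```python
def parse_cacheoccupancy(stats):
    """Turn flat 'cpu.cacheoccupancy.*' fields (name -> string value,
    in the count/index layout proposed above) into a list of per-group
    dicts, each with its list of cache banks."""
    groups = []
    for i in range(int(stats["cpu.cacheoccupancy.count"])):
        prefix = "cpu.cacheoccupancy.%d" % i
        banks = []
        for b in range(int(stats["%s.bank.count" % prefix])):
            # Each bank carries its own id, so ids need not be consecutive.
            banks.append({
                "id": int(stats["%s.bank.%d.id" % (prefix, b)]),
                "bytes": int(stats["%s.bank.%d.bytes" % (prefix, b)]),
            })
        groups.append({
            "name": stats["%s.name" % prefix],
            "vcpus": stats["%s.vcpus" % prefix],
            "banks": banks,
        })
    return groups
```

Because the bank id is an explicit field rather than part of the key, the same parser keeps working if a machine reports ids 0 and 2, say, with no bank 1.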


--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list
