On 09/10/2018 02:10 PM, Wang, Huaqiang wrote: > > >> -----Original Message----- >> From: John Ferlan [mailto:jferlan@xxxxxxxxxx] >> Sent: Wednesday, September 5, 2018 11:00 PM >> To: Wang, Huaqiang <huaqiang.wang@xxxxxxxxx>; libvir-list@xxxxxxxxxx >> Cc: Feng, Shaohe <shaohe.feng@xxxxxxxxx>; Niu, Bing <bing.niu@xxxxxxxxx>; >> Ding, Jian-feng <jian-feng.ding@xxxxxxxxx>; Zang, Rui <rui.zang@xxxxxxxxx> >> Subject: Re: [PATCH 06/10] util: Introduce resctrl monitor for CMT >> >> >> >> On 08/27/2018 07:23 AM, Wang Huaqiang wrote: >>> 'virResctrlAllocMon' denotes a resctrl monitor reporting the resource >>> consumption information. >>> >>> This patch introduced the interfaces for resctrl monitor. >>> >>> Relationship of 'resctrl allocation' and 'resctrl monitor': >>> 1. resctrl monitor monitors resources (cache or memory bandwidth) of >>> particular allocation. >>> 2. resctrl allocation may refer to the 'default' allocation if no >>> dedicated resource 'control' applied to it. The 'default' allocation >>> enjoys remaining resource that not allocated. >>> 3. resctrl monitor belongs to 'default' allocation if no 'cachetune' >>> specified in XML file. >>> 4. one resctrl allocation may have several monitors. It is also >>> permitted that there is no resctrl monitor associated with an >>> allocation. >>> >>> Key data structures: >>> >>> + struct _virResctrlAllocMon { >>> + char *id; >>> + char *path; >>> + }; >>> >>> struct _virResctrlAlloc { >>> virObject parent; >>> >>> @@ -276,6 +289,12 @@ struct _virResctrlAlloc { >>> virResctrlAllocMemBWPtr mem_bw; >>> + virResctrlAllocMonPtr *monitors; >>> + size_t nmonitors; >>> } >>> >>> Signed-off-by: Wang Huaqiang <huaqiang.wang@xxxxxxxxx> >>> --- >>> src/libvirt_private.syms | 6 + >>> src/util/virresctrl.c | 361 >> ++++++++++++++++++++++++++++++++++++++++++++++- >>> src/util/virresctrl.h | 31 ++++ >>> 3 files changed, 394 insertions(+), 4 deletions(-) >>> >> >> Similar to the previous patch - there's a bit too much going on for just one patch >> here. >> >> Introducing "default_group" and "monitors". This needs some separation. >> > > Will split this patch into serval patches. > The 'default_group' will be a separate patch. > >> I am not going to look at this patch. I think you really need to describe this >> default_group concept quite a bit more as you'd be totally changing the >> meaning of alloc->path by "removing" everything after the >> SYSFS_RESCTRL_PATH. What's the purpose of alloc->path then in this >> environment? > > This is a bug. > 'default_group' is not allowed to be removed in libvirt. It is created > by the resource control file system at the time of mounting, can only be > removed at the time of file system unmounting by the system. > In next version virResctrlAllocRemove, a check will be made to determine if > it is removing root directory /sys/fs/resctrl, if yes, report an error. Like I've said before - just be sure to properly separate things. Fixing existing issues in the middle of new code changes is just one of those areas that makes reviewing difficult. > > Following paragraph explains the 'default_group' and some relevant concepts. > I'll also write these content into cover of patch series. In much less detail I hope... Not sure I can properly page in the rest. I read it, but I certainly cannot internalize it. John > > The resctrl default group is described initially by Kernel document > 'intel_rdt_ui.txt', with the link > "https://www.kernel.org/doc/Documentation/x86/intel_rdt_ui.txt". > > The default group is the root directory of the resource control file system, > that is, the directory of '/sys/fs/resctrl/'. The default group is created > immediately after the mounting of resctrl file system, and owns all the tasks > and make full use of all resources. > > In these patches, the libvirt virresctrl 'default_group' is introduced by > extending the scope of current 'resource allocation'. virResctrlAlloc is > presenting a 'resource allocation'. 'default_group' is also called > 'default allocation'. > > The 'default allocation' represents the root directory of resource control > file system. The main difference to a non-default allocation is the > 'default group' is created after fs mounting, and owns all system tasks, while > non-default allocation need to create through 'mdkir' and explicit operation > of filling PID into the tasks file. > > The main purpose for introducing 'default allocation' is creating monitors > that for non-default allocation to save hardware resources. > > Think about the case: a KVM virtual machine with 1 working vcpu, we want to > monitor the cache utilization of the vcpu but don't apply any resource > limitaion on vcpu process. We have two possible solutions: > > 1. Create a new "resource allocation" through making a directory under > /sys/fs/resctrl/, assuming the allocation is /sys/fs/resource/p0, > and put the host process PID for the vcpu into file > /fys/fs/resource/p0/tasks. Then Get the cache utilization information through > file /fys/fs/resource/p0/mon_data/llc_occupancy. > > 2. Do not create an extra allocation, but create a resource monitor under > default allocation by creating an sub-directory under > '/sys/fs/resctrl/mon_group', then put the vcpu PID into the monitor's > sub-directory's tasks file. Get the cache utilization information through this > monitor's llc_occupancy file. > > Comparing with slotion1, solution2 uses less hardware resource by saving one > CLOSID. Normally we have much more RMIDs than CLOSIDs, for CPU E5-2699v4, the > number of CLOSID is 16, while RMID number is 176. > > To support 'default_group' the domain's xml configuration file need to be > changed: > > The 'default allocation' has the similar behavior with original libvirt defined > 'resource allocation' for creating a monitor group, getting resource > consumption data from a monitor, as well as the task assignment operations. > The 'default allocation' could be looked as a special 'resource allocation'. > > > Libvirt treats virResctrlAlloc (sometimes called resource allocation) > as the representing of resctrl allocation, each resctrl allocation has the > capability to report the resource consumption information of involved > tasks, through files 'llc_occupancy', 'mbm_total_bytes' and > 'mbm_local_bytes' under the directory of allocation. > virResctrlAllocMon (sometimes called resource monitor) is introduced to > represent virResctrlAlloc's role for resource consumption monitoring. > One or more resource monitors could be created to monitor the > resource utilization for a small set of tasks of current allocation. > This explains why the 'monitor' is being put under 'cachetune' element in > domain's XML configuration file. > > <cputune> > <cachetune vcpus='0-1'> > <cache id='0' level='3' type='code' size='7680' unit='KiB'/> > <cache id='1' level='3' type='data' size='3840' unit='KiB'/> > + <monitor vcpus='0-1'/> > + <monitor vcpus='0'/> > </cachetune> > </cputune> > > The 'defautl_group' is resource allocation shared by all system tasks that do > not have specified resource limitation. In existing libvirt policy no resource > limitation is allowed to put on it. so I need to generate configuration such as > > <cachetune vcpus='3'> > + <monitor vcpus='3'/> > </cachetune> > > In the implementation, this monitor for monitoring domain's vcpu 3 is > created under 'default allocation', default allocation is the directory > /sys/fs/resctrl, which is set up at time of mounting. > > And above is the reason for why element 'cache' is changed being > optional in RNG file. > > <element name="cachetune"> > <attribute name="vcpus"> > <ref name='cpuset'/> > </attribute> > - <oneOrMore> > + <zeroOrMore> > <element name="cache"> > <attribute name="id"> > <ref name='unsignedInt'/> > </attribute> > <attribute name="level"> > <ref name='unsignedInt'/> > </attribute> > <attribute name="type"> > <choice> > <value>both</value> > <value>code</value> > <value>data</value> > </choice> > </attribute> > <attribute name="size"> > <ref name='unsignedLong'/> > </attribute> > <optional> > <attribute name='unit'> > <ref name='unit'/> > </attribute> > </optional> > </element> > - </oneOrMore> > + </zeroOrMore> > <zeroOrMore> > <element name="monitor"> > <attribute name="vcpus"> > <ref name='cpuset'/> > </attribute> > </element> > </zeroOrMore> > </element> > > Thanks for review. > Huaqiang > [...] -- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list