On Fri, Aug 05, 2022 at 12:32:32PM -0400, John Kacur wrote: > > > On Fri, 5 Aug 2022, Leah Leshchinsky wrote: > > > On Thu, Aug 04, 2022 at 11:48:07AM -0400, Leah Leshchinsky wrote: > > > The run report produced at the end of a run does not contain information > > > on load and measurement thread locations. > > > > > > Adjust MakeReport() functions of LoadModules and MeasurementModules > > > class so that new properties with number of loads and cpu information > > > are added to the XML report and can be read by rteval_text.xsl. > > > > > > Signed-off-by: Leah Leshchinsky <lleshchi@xxxxxxxxxx> > > > > > > diff --git a/rteval/modules/loads/__init__.py b/rteval/modules/loads/__init__.py > > > index 2c2105efa964..5e5a5d85b616 100644 > > > --- a/rteval/modules/loads/__init__.py > > > +++ b/rteval/modules/loads/__init__.py > > > @@ -30,6 +30,7 @@ import libxml2 > > > from rteval.Log import Log > > > from rteval.rtevalConfig import rtevalCfgSection > > > from rteval.modules import RtEvalModules, rtevalModulePrototype > > > +from rteval.systopology import collapse_cpulist, SysTopology > > > > > > class LoadThread(rtevalModulePrototype): > > > def __init__(self, name, config, logger=None): > > > @@ -131,6 +132,12 @@ class LoadModules(RtEvalModules): > > > def MakeReport(self): > > > rep_n = RtEvalModules.MakeReport(self) > > > rep_n.newProp("load_average", str(self.GetLoadAvg())) > > > + rep_n.newProp("loads", str(self.ModulesLoaded())) > > > + cpulist = self._cfg.GetSection(self._module_config).cpulist > > > + if not cpulist: > > > + st = SysTopology() > > > + cpulist = collapse_cpulist(st.online_cpus()) > > > + rep_n.newProp("loadcpus", cpulist) > > > > > > return rep_n > > > > > > diff --git a/rteval/modules/measurement/__init__.py b/rteval/modules/measurement/__init__.py > > > index 318248bd7e35..371fc5d08f44 100644 > > > --- a/rteval/modules/measurement/__init__.py > > > +++ b/rteval/modules/measurement/__init__.py > > > @@ -24,7 +24,7 @@ > > > > > > import libxml2 > > > from rteval.modules import RtEvalModules, ModuleContainer > > > - > > > +from rteval.systopology import collapse_cpulist, SysTopology > > > > > > class MeasurementProfile(RtEvalModules): > > > """Keeps and controls all the measurement modules with the same measurement profile""" > > > @@ -189,6 +189,12 @@ measurement profiles, based on their characteristics""" > > > > > > # Get the reports from all meaurement modules in all measurement profiles > > > rep_n = libxml2.newNode("Measurements") > > > + cpulist = self.__cfg.GetSection("measurement").cpulist > > > + if not cpulist: > > > + st = SysTopology() > > > + cpulist = collapse_cpulist(st.online_cpus()) > > > + rep_n.newProp("measurecpus", cpulist) > > > + > > > for mp in self.__measureprofiles: > > > mprep_n = mp.MakeReport() > > > if mprep_n: > > > diff --git a/rteval/rteval_text.xsl b/rteval/rteval_text.xsl > > > index c40063e3dd19..7ecfac6b6140 100644 > > > --- a/rteval/rteval_text.xsl > > > +++ b/rteval/rteval_text.xsl > > > @@ -13,6 +13,14 @@ > > > <xsl:value-of select="run_info/date"/><xsl:text> </xsl:text><xsl:value-of select="run_info/time"/> > > > <xsl:text> </xsl:text> > > > > > > + <xsl:text> Loads: </xsl:text> > > > + <xsl:value-of select="loads/@loads"/><xsl:text> loads run on cores </xsl:text><xsl:value-of select="loads/@loadcpus"/> > > > + <xsl:text> </xsl:text> > > > + > > > + <xsl:text> Measurement: </xsl:text> > > > + <xsl:text>measurement threads run on cores </xsl:text><xsl:value-of select="Measurements/@measurecpus"/> > > > + <xsl:text> </xsl:text> > > > + > > > <xsl:text> Run time: </xsl:text> > > > <xsl:value-of select="run_info/@days"/><xsl:text> days </xsl:text> > > > <xsl:value-of select="run_info/@hours"/><xsl:text>h </xsl:text> > > > -- > > > 2.31.1 > > > > > > > Hi John, > > > > I do believe the cpulist variable can be empty. I was able to test this by running rteval without specifying and load and measurement cpus. > > Ok, you are correct, cpulist can be empty if the following are not set > --loads-cpulist=LIST > --measurement-cpulist=LIST > > So you do need some code to initialize the list of it is empty. > But there are still problems > > For example if you shut off cpu 3 > su -c 'echo 0 > /sys/devices/system/cpu/cpu3/online ' > > And then request measurement on cpus 0-5 > su -c './rteval-cmd -d5s --measurement-cpulist=0-5' > > The report should say > Measurement: measurement threads run on cores 0-2,4,5 > Loads: 3 loads run on cores 6-11 (the inverse) > but instead with your patch it says > Loads: 3 loads run on cores 1,2,4,6,7,8,9,10,11 > Measurement: measurement threads run on cores 0-5 > > Now there are problems that exist even before your patch, so for example, > the initial info to the screen not included in the report doesn't consider > offline cpus. (We should fix that too of course, but we can do this > stepwise) > > And it's also possible that I have introduced some new problems with the > changes to systopology that will hopefully make things simpler in the long > run but may be interferring with your code. So, there are many things to > untangle here. > > > > > This patch is attempting to replicate the behavior found in rteval/__init.py lines 187-201, which also finds the cpulist using > > cpulist = self._loadmods._cfg.GetSection("loads").cpulist > > and if the cpulist is empty, defaults to reporting the online cpus. I assume this is because the rteval program uses all available cpus > > to run measurement and load threads if no specific cpus are specified. > > > > Please let me know if I am misunderstanding any of the above. > > > > I will investigate why the xsl code is not working for you. It seems to be working for me at the moment. > > > > > > Thank you for the review, > > Leah > > > > > Thanks Leah > > John > Hi John, > The report should say > Measurement: measurement threads run on cores 0-2,4,5 > Loads: 3 loads run on cores 6-11 (the inverse) > but instead with your patch it says > Loads: 3 loads run on cores 1,2,4,6,7,8,9,10,11 > Measurement: measurement threads run on cores 0-5 Now that I'm looking at the rteval program prior to my changes, I see that this issue existed in the initial pre-report posted to stdout. This seems to be a separate bug in the way that 'self._loadmods._cfg.GetSection("loads").cpulist' and self.__cfg.GetSection("measurement").cpulist are calculated. When I test your example on the rteval main branch, this is the result: su -c 'echo 0 > /sys/devices/system/cpu/cpu3/online ' su -c './rteval-cmd -d5s --measurement-cpulist=0-5' started 3 loads on cores 1,2,4,6,7 started measurement threads on cores 0-5 Should we file this as a separate bug, since the scope of it goes beyond my current patch? Thanks, Leah