Hi James, Sorry for the delay. It took a while to get back into this. > -----Original Message----- > From: James Morse [mailto:james.morse@xxxxxxx] > Sent: 19 July 2019 16:30 > To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@xxxxxxxxxx> > Cc: Vijaya Kumar K <vkilari@xxxxxxxxxxxxxx>; Lorenzo Pieralisi > <lorenzo.pieralisi@xxxxxxx>; Tomasz Nowicki > <Tomasz.Nowicki@xxxxxxxxxx>; Jeffrey Hugo <jhugo@xxxxxxxxxxxxxx>; > Guohanjun (Hanjun Guo) <guohanjun@xxxxxxxxxx>; Linuxarm > <linuxarm@xxxxxxxxxx>; Jeremy Linton <jeremy.linton@xxxxxxx>; > linux-acpi@xxxxxxxxxxxxxxx; linux-arm-kernel@xxxxxxxxxxxxxxxxxxx; Sudeep Holla > <sudeep.holla@xxxxxxx>; wangxiongfeng (C) > <wangxiongfeng2@xxxxxxxxxx>; Richard Ruigrok > <rruigrok@xxxxxxxxxxxxxxxx> > Subject: Re: MPAM branch verification (was RE: [RFC PATCH 2/2] ACPI / PPTT: > cacheinfo: Label caches based on fw_token) > > Hi Shameer, > > On 03/07/2019 13:27, Shameerali Kolothum Thodi wrote: > >> -----Original Message----- > >> On 21/06/2019 16:57, Shameerali Kolothum Thodi wrote: > >>>> -----Original Message----- > >>>> From: James Morse [mailto:james.morse@xxxxxxx] > > >> The domid bitfield not being big enough for the width of the cacheinfo id field > >> looks like > >> a bug in the existing resctrl code. Could you spin that as a patch against > >> mainline? > > > > Yes it could be a bug. But I am not sure about the assumption on x86 > platforms with > > respect to cache id width. Also any need to consider 32 bit systems at all or > not. > > > >> It won't affect any x86 system, but I don't want to 'fix' anything as part of > the > >> mpam > >> support. > > > > Does that mean the cache id width on x86 will never be >14 bits? > > I have no idea. Today they're 0,1,2, so its unlikely?, but > Documentation/x86/resctrl.rst's > "Cache IDs" section says "it isn't guaranteed to be a contiguous sequence", so > maybe? > > The problem is 'struct cacheinfo's id field is an int, its exposed via sysfs as an > int, > but resctrl packs it into a smaller size. That's going to bite one day, it would be > good > to fix it now we know its a problem. > > > >> We almost certainly need to compress the cache-id numbers down to {0,1,2} > if > >> only so we > >> haven't filled all the exposed bits on day-1. (so it might not matter for arm64 > >> either...) > > > > That will be nice if we can compress it like that> I think we can leave the fix > for now > > and come up with a solution when things gets really going. > > > > Mean time I am trying to probe memory controller as well on our system and > it looks > > like there are still issues. > > Typo in the MBA picking code? Should be: > | if (!mpam_has_feature(mpam_feat_mbw_part, class->features) && > | !mpam_has_feature(mpam_feat_mbw_max, class->features)) { > > It can do something useful with either of those features, but the (!part || !max) > previously forced it to have both. > > (This still doesn't work on the model as its describing a 0-bit bitmap > MBW_PART) I think what happens on our hardware is, the MBA reports PMG_MAX = 0 and that upsets mpam_pmg_bits() -->ilog2(). I am not entirely sure whether PMG_MAX= 0 is allowed as per spec when the resource reports HAS_MSMON =1. But hasn't found anything in spec that forbids this as the filter is a combination of PRATID:PMG. I have a temp hack here to keep it going, https://github.com/hisilicon/kernel-dev/commit/5e0881c4cdded4066dfac7603c53242385417a3a > > > I will debug and update if it really is a problem. Please > > let me know if you have any plans to update the branch so that I can try the > latest. > > I hope to push a new version by the end of June. (whoosh! There goes June). > http://www.linux-arm.org/git?p=linux-jm.git;a=shortlog;h=refs/heads/mpam/s > napshot/jun Thanks for that. I am using this now. (And I see a more recent one mpam/5.3-tmp now. Has anything changed other than rebase?) > > The changes in there are to avoid the known-issues when the same 'thing' is > picked as both > L3 resource and the MBA resource. Now with the above fix for PMG_MAX=0, I am hitting another issue. mount -t resctrl resctrl /sys/fs/resctrl fails with "File exists" error. Debugging points to, rdt_get_tree() mkdir_mondata_all() mkdir_mondata_subdir_alldom() mkdir_mondata_subdir() mon_addfile() It looks like r->evt_list gets corrupted somehow and has duplicate entries. I haven’t gone into the bottom of this issue, but please let me know if you have any idea. Cheers, Shameer > I think the risk of sleeping-while-atomic if not all mpam:devices are accessible > from all > CPUs in the resctrl:domain is my next highest priority issue... > > > Thanks, > > James