On Fri, Jan 10, 2014 at 06:55:11PM +0000, Waskiewicz Jr, Peter P wrote:
> I've spoken with the CPU architect, and he's set me straight.  I was
> getting some simulation data and reality mixed up, so apologies.
>
> The cacheline is tagged with the RMID being tracked when it's brought
> into the cache.  That is the only time it's tagged, it does not get
> updated (I was looking at data showing impacts if it was updated).
>
> If there are frequent RMID updates for a particular process, then there
> is the possibility that any remaining old data for that process can be
> accounted for on a different RMID.  This really is workload dependent,
> and my architect provided their data showing that this occurrence is
> pretty much in the noise.

What change frequency and what sized workloads did they test?

I can make it significant; take a multi-threaded workload that mostly
fits in cache, then assign all threads but one RMID 0, then fairly
quickly rotate RMID 1 between the threads.

The problem is, since there's a limited number of RMIDs we have to
rotate at some point, but since changing RMIDs is nondeterministic we
can't.

> Also, I did ask about the granularity of the RMID, and it is
> per-cacheline.  So if there is a non-exclusive cacheline, then the
> occupancy data in the other part of the cacheline will count against the
> RMID.

One more question:

	u64 i;
	u64 rmid_val[];

	for (i = 0; i < rmid_max; i++) {
		wrmsr(IA32_QM_EVTSEL, 1 | (i << 32));
		rdmsr(IA32_QM_CTR, rmid_val[i]);
	}

Is this the right way of reading these values? I couldn't find anything
that says the event must 'run' to accumulate a value at all, so all it
seems to be is a direct value read with a multiplexer to the RMID.

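For illustration, a minimal userspace sketch of the select-then-read
pattern in the loop above. The MSR numbers, the event ID and the
error/unavailable bits follow one reading of the SDM and should be
treated as assumptions; sim_wrmsr(), sim_rdmsr(), qm_read(), RMID_MAX
and the occupancy table are hypothetical stand-ins so that it builds
outside the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed MSR numbers and bit layout (from a reading of the SDM). */
#define IA32_QM_EVTSEL	0xc8d
#define IA32_QM_CTR	0xc8e
#define QM_EVT_L3_OCCUP	1ULL		/* event ID 1: L3 occupancy */
#define QM_CTR_ERROR	(1ULL << 63)	/* bad event ID or RMID */
#define QM_CTR_UNAVAIL	(1ULL << 62)	/* no data for this RMID */
#define QM_CTR_DATA	(QM_CTR_UNAVAIL - 1)	/* bits 61:0 */

#define RMID_MAX	4	/* hypothetical; the real limit comes from CPUID */

/* Simulated hardware state: last value written to the event select MSR. */
static uint64_t sim_evtsel;

/* Simulated per-RMID occupancy; RMID 2 reports "unavailable". */
static const uint64_t sim_occupancy[RMID_MAX] = {
	100, 7, QM_CTR_UNAVAIL, 42
};

/* Hypothetical stand-in for the kernel's MSR write. */
static void sim_wrmsr(uint32_t msr, uint64_t val)
{
	assert(msr == IA32_QM_EVTSEL);
	sim_evtsel = val;
}

/* Hypothetical stand-in for the kernel's MSR read. */
static uint64_t sim_rdmsr(uint32_t msr)
{
	uint64_t rmid = sim_evtsel >> 32;

	assert(msr == IA32_QM_CTR);
	if ((sim_evtsel & 0xff) != QM_EVT_L3_OCCUP || rmid >= RMID_MAX)
		return QM_CTR_ERROR;
	return sim_occupancy[rmid];
}

/*
 * Select (event, RMID), then read the value back.  Nothing is started
 * or stopped; the select acts purely as a multiplexer.
 * Returns 0 and stores the value on success, -1 on error/unavailable.
 */
int qm_read(uint64_t rmid, uint64_t *val)
{
	uint64_t ctr;

	sim_wrmsr(IA32_QM_EVTSEL, QM_EVT_L3_OCCUP | (rmid << 32));
	ctr = sim_rdmsr(IA32_QM_CTR);
	if (ctr & (QM_CTR_ERROR | QM_CTR_UNAVAIL))
		return -1;

	*val = ctr & QM_CTR_DATA;
	return 0;
}
```

Calling qm_read() for each RMID from 0 up to the limit mirrors the loop;
the only addition is the check of the two status bits before the value
is trusted.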
> > So my current mental model would tag a line with the current (ASSOC)
> > RMID on:
> >  - load from DRAM -> L*, even for non-exclusive
> >  - any to exclusive transition
> >
> > The result of such rules is that when the effective RMID of a task
> > changes it takes an indeterminate amount of time before the residency
> > stats reflect reality again.
> >
> > Furthermore; the IA32_QM_CTR is a misnomer as it's a VALUE not a COUNTER.
> > Not to mention the entire SDM 17.14.2 section is a mess; it purports to
> > describe how to detect the thing using CPUID but then also maybe
> > describes how to program it.
>
> I've given this feedback to the section owner in the SDM.  There is an
> update due this month, and there will be some updates to this section
> (along with some additions).
>
> I should have my alternate implementation sent out shortly, just working
> a few kinks out of it.  This is the proc-based and sysfs-based interface
> that will rely on a userspace program to handle the logic of grouping
> and assigning stuff together.

I've not figured out how to deal with this stuff yet; exposing RMIDs to
userspace is a guaranteed fail though. Any interface that prevents the
kernel from managing the RMIDs is broken.
_______________________________________________
Containers mailing list
Containers@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linuxfoundation.org/mailman/listinfo/containers