On Thu, 23 May 2024 09:36:01 -0700 Dan Williams <dan.j.williams@xxxxxxxxx> wrote: > Jonathan Cameron wrote: > > > > > > > > > Whilst the CXL side of things (and I assume your hardware migration engine) > > > > > > don't provide a way to recover this, it would be possible to build > > > > > > a system that otherwise looked like you describe that did provide access > > > > > > to the tag bits and so wouldn't present the aliasing problem. > > > > > > > > > > Aliasing problem? All direct-mapped caches have aliases, it just happens that > > > > > this address mode allows direct-addressability of at least one alias. > > > > > > > > As I understand this the problem is you get address A in the error record, > > > > but that actually means any of A, A + N, A + 2N etc and the issue is you > > > > have no way of recovering which alias you have. > > > > > > > > Another implementation might have the same aliasing in the cache, but allow > > > > for establishing which one you have (the hardware inherently has to know that > > > > but I presume in this case doesn't provide a way to look it up - or if it > > > > does, then issue here is that the OS querying of the CXL device doesn't know > > > > about that interface?). So I think the critical here is that information is > > > > not available, not that aliasing occurs. > > > > > > The critical information is that the address range is extended by the cache > > > capacity compared to the typical case. Maybe "extended-linear" is the term I was > > > searching for last Friday when I could not think of a better bikeshed color? > > > > > > The reason an "extended-linear" indicator is important is for the driver to > > > recognize that the CXL address range programmed into the decoders is only a > > > subset of the system-physical address ranges that may route traffic to CXL. So > > > when the memory-side-cache is in this "extended" mode there are more addresses > > > that may route to CXL. > > > > I think we need to be careful with decoders here because the extra translation in the > > path means they aren't in HPA space as such. They are in a new HPA+ space. > > In your case I think the translation is such that addresses are the bottom of the > > HPA window, but they could just as easily be the top of the HPA window or not > > within it at all... > > No need for an HPA+ concept. This is just an HPA vs SPA distinction, > similar to what we dealt with here: > > 0cab68720598 cxl/pci: Fix disabling memory if DVSEC CXL Range does not match a CFMWS window Sure, if we can avoid a reference to 'subset' then I think this is fine - or avoid relating this to decoders at all. > > Typically HPA and SPA are a 1:1 relationship, but in this case there is > a memory-side cache that sometimes translates the SRAT SPA to CXL HPA vs > DDR HPA. For any given SPA in the SRAT range there is no way to know > whether it is currently dynamically mapped to CXL or DDR. > > > | HPA window 1 - Length = Cache + CXL | > > | HPA+ window 1 - Length = CXL only | > > HPA windows are never impacted by this memory side cache addressing. > > > > > > or > > | HPA window 1 - Length = Cache + CXL | > > | HPA+ window 1 - Length = CXL only | > > > > or for giggles > > > > | HPA window 1 - Length = Cache + CXL | > > | HPA+ window - Length = CXL only | > > > > last one might seem odd but if you are packing multiple of these you might get > > | HPA window 1 - Length = Cache + CXL | HPA window 2 Ln = Cache + CXL | > > | HPA+ window 1 - Length = CXL only | HPA+ window 2 Len = CXL only| > > > > To reduce decoder costs in the fabric (yeah we don't do this today but the > > bios might :) > > No, BIOS should have no opporunity to confuse "HPA" layout. Let me see > if I can cutoff this line of confusion in the next rev and explicitly > call out SPA vs HPA expectations. > > > So should the text say anything about decoder address vs (SRAT / HMAT addressing) > > Maybe reasonable to say it's contained and aligned so modulo maths works? > > This is a bit odd as HMAT wouldn't typically provide this info, but this addressing > > mode already incorporates it sort of... > > SRAT portrays capacity, HMAT portrays cache and address organization. > There is no need for bringing CXL decoder concepts into the HMAT. Absolutely - avoid any reference to decoders and we are fine. > > [..] > > > > > I still disagree with the implication that "inclusion" is a property of the > > > > > cache and not the address layout for this ECN. > > > > > > > > It's an ECN about caches - the chance of misunderstanding is high. > > > > Maybe there isn't a better option, but it definitely makes me feel uncomfortable. > > > [..] > > > > Maybe hyphen will help? Inclusive-linear Address mode? > > > > to avoid reading this as separate adjectives as in that this is an > > > > 'inclusive' cache that has a 'linear address' mode? > > > > > > Try this on for size: > > > > > > * "When Address Mode is 1 'Extended-Linear' it indicates that the associated > > > address range (SRAT.MemoryAffinityStructure.Length) is comprised of the > > > backing store capacity extended by the cache capacity. It is arranged such > > > that there are N directly addressable aliases of a given cacheline where N is > > > the ratio of target memory proximity domain size and the memory side cache > > > size. Where the N aliased addresses for a given cacheline all share the same > > > result for the operation 'address modulo cache size'. This setting is only > > > allowed when 'Cache Associativity' is 'Direct Map'." > > > > > > > > I don't promise not to change my mind, but today LGTM. > > This sounds very similar to the voice that is always in my mind when > reviewing code, reminds me of one of my favorite Star Wars quotes, "I am > altering the deal, pray I do not alter it any further." :) Jonathan