Re: [ACPI Code First ECN] Enumerate "Inclusive Linear Address Mode" memory-side caches

Dan Williams <dan.j.williams@xxxxxxxxx> · Thu, 23 May 2024 09:36:01 -0700

Jonathan Cameron wrote:
> > 
> > > > > Whilst the CXL side of things (and I assume your hardware migration engine)
> > > > > don't provide a way to recover this, it would be possible to build
> > > > > a system that otherwise looked like you describe that did provide access
> > > > > to the tag bits and so wouldn't present the aliasing problem.    
> > > > 
> > > > Aliasing problem? All direct-mapped caches have aliases, it just happens that
> > > > this address mode allows direct-addressability of at least one alias.  
> > > 
> > > As I understand this the problem is you get address A in the error record,
> > > but that actually means any of A, A + N, A + 2N etc and the issue is you
> > > have no way of recovering which alias you have. 
> > > 
> > > Another implementation might have the same aliasing in the cache, but allow
> > > for establishing which one you have (the hardware inherently has to know that
> > > but I presume in this case doesn't provide a way to look it up - or if it
> > > does, then issue here is that the OS querying of the CXL device doesn't know
> > > about that interface?).  So I think the critical here is that information is
> > > not available, not that aliasing occurs.  
> > 
> > The critical information is that the address range is extended by the cache
> > capacity compared to the typical case. Maybe "extended-linear" is the term I was
> > searching for last Friday when I could not think of a better bikeshed color?
> > 
> > The reason an "extended-linear" indicator is important is for the driver to
> > recognize that the CXL address range programmed into the decoders is only a
> > subset of the system-physical address ranges that may route traffic to CXL. So
> > when the memory-side-cache is in this "extended" mode there are more addresses
> > that may route to CXL.
> 
> I think we need to be careful with decoders here because the extra translation in the
> path means they aren't in HPA space as such.  They are in a new HPA+ space.
> In your case I think the translation is such that addresses are the bottom of the
> HPA window, but they could just as easily be the top of the HPA window or not
> within it at all...

No need for an HPA+ concept. This is just an HPA vs SPA distinction,
similar to what we dealt with here:

0cab68720598 cxl/pci: Fix disabling memory if DVSEC CXL Range does not match a CFMWS window

Typically HPA and SPA are a 1:1 relationship, but in this case there is
a memory-side cache that sometimes translates the SRAT SPA to CXL HPA vs
DDR HPA. For any given SPA in the SRAT range there is no way to know
whether it is currently dynamically mapped to CXL or DDR.

> |     HPA window 1 - Length = Cache + CXL                 |
> |   HPA+ window 1 - Length = CXL only |

HPA windows are never impacted by this memory side cache addressing.

> 
> or
> |     HPA window 1 - Length = Cache + CXL                |
>                    |   HPA+ window 1 - Length = CXL only |
> 
> or for giggles
> 
> |     HPA window 1 - Length = Cache + CXL                |
>                                                    |   HPA+ window - Length = CXL only |
> 
> last one might seem odd but if you are packing multiple of these you might get
> |     HPA window 1 - Length = Cache + CXL      |   HPA window 2 Ln = Cache + CXL           |
> |   HPA+ window 1 - Length = CXL only |  HPA+ window 2  Len = CXL only|
> 
> To reduce decoder costs in the fabric (yeah we don't do this today but the
> bios might :)

No, BIOS should have no opporunity to confuse "HPA" layout. Let me see
if I can cutoff this line of confusion in the next rev and explicitly
call out SPA vs HPA expectations.

> So should the text say anything about decoder address vs (SRAT / HMAT addressing)
> Maybe reasonable to say it's contained and aligned so modulo maths works?
> This is a bit odd as HMAT wouldn't typically provide this info, but this addressing
> mode already incorporates it sort of...

SRAT portrays capacity, HMAT portrays cache and address organization.
There is no need for bringing CXL decoder concepts into the HMAT.

[..]
> > > > I still disagree with the implication that "inclusion" is a property of the
> > > > cache and not the address layout for this ECN.  
> > > 
> > > It's an ECN about caches - the chance of misunderstanding is high.
> > > Maybe there isn't a better option, but it definitely makes me feel uncomfortable.  
> > [..]
> > > Maybe hyphen will help? Inclusive-linear Address mode?
> > > to avoid reading this as separate adjectives as in that this is an
> > > 'inclusive' cache that has a 'linear address' mode?  
> > 
> > Try this on for size:
> > 
> > * "When Address Mode is 1 'Extended-Linear' it indicates that the associated
> >    address range (SRAT.MemoryAffinityStructure.Length) is comprised of the
> >    backing store capacity extended by the cache capacity. It is arranged such
> >    that there are N directly addressable aliases of a given cacheline where N is
> >    the ratio of target memory proximity domain size and the memory side cache
> >    size. Where the N aliased addresses for a given cacheline all share the same
> >    result for the operation 'address modulo cache size'. This setting is only
> >    allowed when 'Cache Associativity' is 'Direct Map'."  
> > 
> > 
> I don't promise not to change my mind, but today LGTM.

This sounds very similar to the voice that is always in my mind when
reviewing code, reminds me of one of my favorite Star Wars quotes, "I am
altering the deal, pray I do not alter it any further."