Jonathan Cameron wrote:
[..]
> > > I'd drop the 'may assume' Also after this change it's not reserved.
> > > 0 explicitly means transparent cache addressing.
> >
> > I am just going to switch the parenthetical to "(Unknown Address Mode)"
> > because "transparent" does not give any actionable information about
> > alias layout in the SRAT address space. So system-software can make no
> > assumptions about layout without consulting implementation specific
> > documentation.
>
> I'd like an option to indicate that we know reported errors will not
> involve problems with aliases. Something like...
>
> 0 - Unknown (all bets are off, read the manual).
> 1 - No aliases.
> 2 - your one.
>
> A simple write-through or write-back cache would not result in aliases
> for errors reported by the backing memory.

This seems a separate proposal, and needs more discussion because there
*are* aliases. While there is no HPA aliasing, there is a FRU
(field-replaceable-unit) aliasing. So if system-software wants to
determine what indicators to fire (i.e. replace cache-mem, replace
backing-mem, or both) to the tech servicing the node it needs some ACPI
help.

I would be ok to do:

0 - Unknown (all bets are off, read the manual).
1 - Reserved
2 - Extended linear

...just to try to keep the list ordered by complexity for now.

However, I am also worried about the case where folks want to do "noisy
neighbor mitigation", which is something that has been attempted with
PMEM caches. This involves knowing the layout of cache conflicts which
need not be linear and involves reading the manual. So, I am not sure
defining a "no aliases" indicator now improves the Extended Linear
proposal, or is an improvement upon "read the manual".

> Assuming we don't get an address corruption (in which case everything
> dead anyway as uncontainable error), then poison can come from:
> 1) poison happens in the memory itself (fine, the DPA in CXL is enough)
> 2) poison happens in cache and is written back to memory. (fine
> the DPA in CXL is enough).
> 3) poison happens in cache and is read by host. Synchronous handling and
> the HPA is available and enough.
>
> Not much we can do with 0, but 1 at least lets us know we have the
> single right answer.

That is, assuming that this is caching CXL. With CXL, the DPA
information is available to disambiguate the source of the poison, but
for memory-side-caches that are not backed by CXL, what does
system-software do with that "1" case?
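
For illustration only, the "0 - Unknown, 1 - Reserved, 2 - Extended linear"
list above could be consumed by system-software roughly as sketched below.
All identifiers (cache_addr_mode, cache_alias_layout_known(),
cache_addr_mode_name()) are hypothetical and do not come from the ACPI
spec or any kernel header. The sketch assumes only that the numeric
values match the list above and that anything other than "Extended
linear" means "read the manual".

```c
#include <stdbool.h>

/*
 * Hypothetical names, not taken from any specification or kernel header.
 * The numeric values follow the list proposed in this thread.
 */
enum cache_addr_mode {
	CACHE_ADDR_MODE_UNKNOWN = 0,		/* all bets are off, read the manual */
	CACHE_ADDR_MODE_RESERVED = 1,		/* left unassigned for now */
	CACHE_ADDR_MODE_EXTENDED_LINEAR = 2,
};

/*
 * Returns true only when the mode gives system-software actionable
 * information about alias layout. Unknown, reserved, and unrecognized
 * values all fall back to implementation specific documentation.
 */
static bool cache_alias_layout_known(unsigned int mode)
{
	return mode == CACHE_ADDR_MODE_EXTENDED_LINEAR;
}

/* Human-readable name for logging; unassigned values report "reserved". */
static const char *cache_addr_mode_name(unsigned int mode)
{
	switch (mode) {
	case CACHE_ADDR_MODE_UNKNOWN:
		return "unknown";
	case CACHE_ADDR_MODE_EXTENDED_LINEAR:
		return "extended-linear";
	default:
		return "reserved";
	}
}
```

Treating every value except 2 as "layout unknown" keeps the reserved slot
free to take on a meaning later without changing what existing software
may assume.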