> > At least on Intel you can only get a machine check for operation on poison data LOAD.
> > Not for a STORE. I believe that is generally true - other arches to confirm.
>
> So what happens if you store to a poisoned cacheline on Intel? It'll
> raise a poison consumption error when that cacheline is loaded in the
> cache? Because you need to load that line into the cache for writing,
> I'd presume...

There are two places in the pipeline where poison is significant.

1) When the memory controller gets a request to fetch some data. If the
ECC check on the bits returned from the DIMMs fails, the memory
controller will log a "UCNA" signature error to a machine check bank for
the memory channel where the DIMMs live. If CMCI is enabled for that
bank, then a CMCI is sent to all logical CPUs that are in the scope of
that bank (generally a CPU socket). The data is marked with a POISON
signature and passed to the entity that requested it. Caches support
this POISON signature and preserve it as data is moved between caches,
or written back to memory. The request may have been a prefetch or a
speculative read. In these cases there won't be a machine check. Linux
uc_decode_notifier() will try to offline pages when it sees UCNA
signatures.

2) When a CPU core tries to retire an instruction that consumes poison
data, or needs to retire a poisoned instruction. These log an SRAR
signature into a core scoped bank (on most Xeons to date bank 0 for
poisoned instructions, bank 1 for poisoned data consumption). Then they
signal a machine check.

> What happens if you have bits flipped in the cacheline you want to write
> to?
>
> That's fine because you're overwriting them anyway?
>
> I'd presume ECC check gets performed on cacheline load and then you'll
> have to raise an #MC...

Partial cacheline stores to data marked as POISON in the cache maintain
the poison status.
Full cacheline writes (certainly with the MOVDIR64B instruction,
possibly with some AVX512 instructions) can clear the POISON status
(since you have all new data). A sequence of partial cacheline stores
that overwrite all data in a cacheline will NOT clear the POISON status.

Nothing is logged or signaled when updating data in the cache.

-Tony
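
For illustration only, here is a toy software model of the store rules
described above. It is a sketch, not a description of how the hardware is
built. The names CacheLine, MachineCheck and LINE_SIZE are made up for
this example, and it assumes that a full-line write always clears poison,
whereas the text only says that such a write can clear it.

```python
from dataclasses import dataclass, field

LINE_SIZE = 64  # assumed cacheline size in bytes (hypothetical constant)


class MachineCheck(Exception):
    """Stands in for the #MC signaled when poisoned data is consumed."""


@dataclass
class CacheLine:
    """Toy model of one cacheline carrying a POISON marker."""
    data: bytearray = field(default_factory=lambda: bytearray(LINE_SIZE))
    poisoned: bool = False

    def store(self, offset: int, payload: bytes) -> None:
        """Apply one store. Nothing is logged or signaled here."""
        if offset < 0 or offset + len(payload) > LINE_SIZE:
            raise ValueError("store crosses the cacheline boundary")
        self.data[offset:offset + len(payload)] = payload
        # Only a single store that covers the whole line replaces all of
        # the data at once, so only that case drops the poison marker.
        # (Assumption: the model always clears it in that case.)
        if len(payload) == LINE_SIZE:
            self.poisoned = False

    def load(self, offset: int, length: int) -> bytes:
        """Model a load that retires: consuming poison raises MachineCheck."""
        if offset < 0 or offset + length > LINE_SIZE:
            raise ValueError("load crosses the cacheline boundary")
        if self.poisoned:
            raise MachineCheck("SRAR: poisoned data consumed")
        return bytes(self.data[offset:offset + length])


if __name__ == "__main__":
    line = CacheLine(poisoned=True)
    # Eight 8-byte stores overwrite every byte, yet poison is kept.
    for off in range(0, LINE_SIZE, 8):
        line.store(off, bytes(8))
    print("after partial stores:", line.poisoned)
    # One full-line store supplies all new data and clears poison.
    line.store(0, bytes(LINE_SIZE))
    print("after full-line store:", line.poisoned)
```

The model keeps poison as a per-line flag rather than per-byte state, which
is why a run of partial stores cannot clear it even when together they
cover the whole line.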