On Mon, Mar 11, 2019 at 5:08 PM Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> On Mon, Mar 11, 2019 at 8:37 AM Dan Williams <dan.j.williams@xxxxxxxxx> wrote:
> >
> > Another feature the userspace tooling can support for the PMEM as RAM
> > case is the ability to complete an Address Range Scrub of the range
> > before it is added to the core-mm. I.e at least ensure that previously
> > encountered poison is eliminated.
>
> Ok, so this at least makes sense as an argument to me.
>
> In the "PMEM as filesystem" part, the errors have long-term history,
> while in "PMEM as RAM" the memory may be physically the same thing,
> but it doesn't have the history and as such may not be prone to
> long-term errors the same way.
>
> So that validly argues that yes, when used as RAM, the likelihood for
> errors is much lower because they don't accumulate the same way.
>
> > The driver can also publish an
> > attribute to indicate when rep; mov is recoverable, and gate the
> > hotplug policy on the result. In my opinion a positive indicator of
> > the cpu's ability to recover rep; mov exceptions is a gap that needs
> > addressing.
>
> Is there some way to say "don't raise MC for this region"? Or at least
> limit it to a nonfatal one?

I wish, but no. Poison consumption always raises the MC; what varies is
whether MCI_STATUS_PCC (processor context corrupt) is set, and that is
how the cpu indicates whether it is safe to proceed. There's no way to
say "never set MCI_STATUS_PCC", or to silence the exception.
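
To make the PCC point concrete, here is a minimal sketch of the test in question. The bit positions are the ones I believe Linux uses in arch/x86/include/asm/mce.h for IA32_MCi_STATUS. The helper name mc_context_intact() is hypothetical, not a kernel function, and the kernel's actual severity grading weighs more bits than this.

```c
#include <stdbool.h>
#include <stdint.h>

/* IA32_MCi_STATUS bits, named as in Linux's asm/mce.h (assumed layout). */
#define MCI_STATUS_VAL (1ULL << 63) /* bank holds a valid error record */
#define MCI_STATUS_UC  (1ULL << 61) /* error was not corrected by hardware */
#define MCI_STATUS_PCC (1ULL << 57) /* processor context corrupt */

/*
 * Hypothetical helper: report whether the cpu claims its context
 * survived the machine check logged in this bank. Software can only
 * read PCC after the exception; it cannot ask the cpu to never set it.
 */
static inline bool mc_context_intact(uint64_t status)
{
	if (!(status & MCI_STATUS_VAL))
		return true; /* nothing logged in this bank */

	return !(status & MCI_STATUS_PCC);
}
```

A caller would pass in the value read from the bank's status MSR. A result of false means the only safe response is to treat the event as fatal.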