On 06/28/2017 01:15 PM, Paolo Bonzini wrote:
>
>
> On 28/06/2017 12:41, Christian Borntraeger wrote:
>> Martin,
>>
>> in preparation for KVM code to handle machine check (forward these into
>> guests) here are two base patches that enhance the machine check handler
>> to not crash the host when a "damage type" machine check happens while
>> in SIE. This is a topic tag/branch on top of rc1.
>>
>> Paolo,Radim, FYI I will also merge this into my next branch to add KVM
>> specific followup code for guest reinjection.
>
> Sounds good. Out of curiosity, I don't see any handling of
> BUS_MCEERR_{AO,AR} in arch/s390, are you going to add that too?
>
> Paolo
>

This is probably more a question for Martin or Heiko, but our initial
machine check handling seems to be older than the hwpoison
infrastructure (older than the 2.6 git history). Historically we have
killed processes with SIGSEGV on fatal errors and never included the
BUS_MCEERR things, so we still kill processes on errors.

From an architectural point of view we can get the failing address, so
maybe we should consider the BUS_MCEERR things for memory errors. On the
other hand, since z196 (2010) the memory is protected with RAIM (in
addition to ECC), and I am not aware of any field incident where the HW
was not able to recover since the introduction of RAIM, so the pressure
to do that is pretty small.

If we decide to do that, this would require additional changes for KVM:
we would then need to translate the host address into a guest address
or, as a V1, unset the valid bit for the failing address information.

Christian