Re: [GIT PULL 0/2] s390,kvm: provide plumbing for machines checks when running guests

Paolo Bonzini <pbonzini@xxxxxxxxxx> · Wed, 28 Jun 2017 14:48:52 +0200

On 28/06/2017 13:53, Christian Borntraeger wrote:
> This is probably more a question for Martin or Heiko, but our initial machine
> check handling seems to be older than the hwpoison infrastructure (older than
> the 2.6 git history). Historically we have killed processes with SIGSEGV on 
> fatal errors and never included the BUS_MCEERR things so we still kill processes
> on errors. From an architectural point of view we can get the failing address, 
> so maybe we should consider the BUS_MCEERR things for memory errors.

Also because other architectures use SIGBUS, and QEMU uses SIGBUS too.

> On the other hand, since z196 (2010) the memory is protected with RAIM (in addition
> to ECC), and I am not aware of any field incident where HW was not able to recover
> since the introduction of RAIM, so the pressure to do that is pretty small.
> 
> If we decide to do that, this would require additional changes for KVM - we would then
> need to translate the host address into a guest address or as V1 unset the valid
> bit for the failing address information.

That's fine.  Does s390 also do background scrubbing of memory?  That
would result in an action-optional SIGBUS (siginfo->si_code ==
BUS_MCEERR_AO) being sent to programs that request it with prctl; QEMU
does so in qemu_init_sigbus.  Such programs can forward the scrubbing
results to the guest and avoid a later action-required MCE.
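
For illustration, here is a minimal sketch of how a userspace program might
opt in to early (action-optional) machine-check SIGBUS delivery and tell the
two si_code values apart.  It is not QEMU's actual code: the names mce_kind,
classify_sigbus, sigbus_handler and request_early_mce_sigbus are hypothetical,
and the sketch assumes a Linux host whose headers provide PR_MCE_KILL.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/prctl.h>

/* Fallbacks for older headers; the values match the Linux UAPI. */
#ifndef BUS_MCEERR_AR
#define BUS_MCEERR_AR 4
#endif
#ifndef BUS_MCEERR_AO
#define BUS_MCEERR_AO 5
#endif

/* Hypothetical classification of a SIGBUS by its si_code. */
enum mce_kind { MCE_NONE = 0, MCE_ACTION_OPTIONAL, MCE_ACTION_REQUIRED };

/* Written by the handler, read by the main loop. */
volatile sig_atomic_t last_kind = MCE_NONE;
volatile uintptr_t last_addr;

enum mce_kind classify_sigbus(int si_code)
{
    switch (si_code) {
    case BUS_MCEERR_AO:   /* found early, e.g. by scrubbing; not yet consumed */
        return MCE_ACTION_OPTIONAL;
    case BUS_MCEERR_AR:   /* the faulting access hit the bad page */
        return MCE_ACTION_REQUIRED;
    default:              /* an ordinary SIGBUS, not a memory error */
        return MCE_NONE;
    }
}

void sigbus_handler(int sig, siginfo_t *si, void *ctx)
{
    (void)sig;
    (void)ctx;
    last_kind = classify_sigbus(si->si_code);
    if (last_kind != MCE_NONE) {
        /* si_addr is the failing host virtual address.  A VMM would
         * translate it into a guest address before forwarding the error. */
        last_addr = (uintptr_t)si->si_addr;
    }
}

/* Install the handler, then ask the kernel for early-kill delivery.
 * Returns 0 on success, -1 on failure. */
int request_early_mce_sigbus(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = sigbus_handler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGBUS, &sa, NULL) != 0)
        return -1;

    return prctl(PR_MCE_KILL, PR_MCE_KILL_SET, PR_MCE_KILL_EARLY, 0, 0);
}
```

Without the prctl call the process follows the system-wide default policy,
so it may only learn about the bad page when it actually touches it and gets
the action-required variant.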

Paolo