[RFC][PATCH] Add a sysctl option controlling kexec when MCE occurred

"H. Peter Anvin" <hpa at zytor.com> writes:

> On 12/25/2010 09:19 AM, Eric W. Biederman wrote:
>>>
>>> So, kdump may receive a wrong identifier when it starts after an MCE
>>> has occurred, because MCEs are reported for memory, cache, and TLB errors.
>>>
>>> In the worst case, kdump will overwrite user data if it mistakes a
>>> disk holding user data for a dump disk.
>> 
>> Absurdly unlikely.  There is a sha256 checksum verified over the
>> kdump kernel before it starts booting.  If you have very broken
>> memory it is possible, but it is absurdly unlikely that the machine
>> will even boot if you are having enough uncorrectable memory errors
>> an hour to get past the sha256 checksum and then be corrupt.
>> 
>
> That wouldn't be the likely scenario (passing a sha256 checksum with the
> wrong data due to a random event will never happen for all the computers
> on Earth before the Sun destroys the planet).  However, in a
> failing-memory scenario, the much more likely scenario is that kdump
> starts up, verifies the signature, and *then* has corruption causing it
> to write to the wrong disk or whatnot.  This is inherent in any scheme
> that allows writing to hard media after a failure (as opposed to, say,
> dumping to the network.)

Then the kdump kernel should also panic if it detects an uncorrected ECC
error.  So this doesn't appear to open any new holes for disk corruption.

kexec on panic can also be used for taking crash dumps over the
network.  What happens with the data is totally defined by userspace
code in an initrd.

Which is why extra policy knobs should be where they can be used.
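
As a rough illustration of policy living in the initrd, here is a minimal
sketch of the kind of decision userspace could make.  Everything in it is
hypothetical: the function name, its parameters, and the three outcomes
are assumptions for the sake of the example, not anything taken from
kexec-tools, the kernel, or an existing initrd.

```python
from typing import Optional

# Hypothetical policy helper for a kdump initrd.  None of these names
# come from kexec-tools or the kernel; they only illustrate the idea.


def choose_dump_action(mce_occurred: bool,
                       allow_disk_after_mce: bool,
                       network_target: Optional[str]) -> str:
    """Return "disk", "network", or "skip" for the crash dump."""
    if not mce_occurred:
        # No machine check was reported: use the normal disk dump.
        return "disk"
    if network_target is not None:
        # After an MCE, prefer a target that cannot clobber local disks.
        return "network"
    # No network target: write to disk only if the admin opted in.
    return "disk" if allow_disk_after_mce else "skip"
```

With this sketch, a machine that took an MCE and has a network target
configured dumps over the network, and one without a network target
skips the dump unless writing to disk was explicitly allowed.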

Eric


