Re: [PATCH v2 0/3] support for broken memory modules (BadRAM)

On 06/22/2011 01:30 PM, Stefan Assmann wrote:
> On 22.06.2011 20:15, H. Peter Anvin wrote:
>> On 06/22/2011 04:18 AM, Stefan Assmann wrote:
>>>
>>> The idea is to allow the user to specify RAM addresses that shouldn't be
>>> touched by the OS, because they are broken in some way. Not all machines have
>>> hardware support for hwpoison, ECC RAM, etc, so here's a solution that allows to
>>> use bitmasks to mask address patterns with the new "badram" kernel command line
>>> parameter.
>>> Memtest86 has an option to generate these patterns since v2.3 so the only thing
>>> for the user to do should be:
>>> - run Memtest86
>>> - note down the pattern
>>> - add badram=<pattern> to the kernel command line
>>>
>>
>> We already support the equivalent functionality with
>> memmap=<length>$<address> for those with only a few ranges... this has
>> been supported for ages, literally.  For those with a lot of ranges,
>> like Google, the command line is insufficient.
> 
> Right, I think this has been discussed a while ago. So the advantages I
> see in this approach are. It allows to break down memory exclusion to
> the page level with a pattern of non-consecutive pages. So if every
> other page would be considered bad that's a bit tough to deal with using
> memmap.
> Secondly patterns can be easily generated by running Memtest86 and thus
> easily be fed to the kernel by command line. Making it much more feasible
> for the average user to take advantage of it.
> 

How common are nontrivial patterns on real hardware?  This would be
interesting to hear from Google or another large user.
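
To make the "every other page" case quoted above concrete, here is a
rough user-space sketch (not kernel code) comparing a single
address/mask pattern with the memmap= entries needed to cover the same
pages. It assumes the convention of the original BadRAM patches, where
an address A is bad when (A & mask) == (addr & mask), and 4 KiB pages.
The function names are hypothetical, and the memmap strings use the
size$start order documented in kernel-parameters.txt.

```python
PAGE_SIZE = 1 << 12  # assumption: 4 KiB pages


def bad_pages(addr, mask, mem_start, mem_end):
    """List page-aligned addresses in [mem_start, mem_end) hit by a pattern.

    Assumed convention: page A is bad when (A & mask) == (addr & mask).
    """
    want = addr & mask
    return [a for a in range(mem_start, mem_end, PAGE_SIZE)
            if (a & mask) == want]


def merge_ranges(pages):
    """Coalesce sorted, consecutive bad pages into (start, length) ranges."""
    ranges = []
    for page in pages:
        if ranges and ranges[-1][0] + ranges[-1][1] == page:
            start, length = ranges[-1]
            ranges[-1] = (start, length + PAGE_SIZE)
        else:
            ranges.append((page, PAGE_SIZE))
    return ranges


def memmap_args(ranges):
    """Format each range as a memmap=<size>$<start> reservation."""
    return [f"memmap={length:#x}${start:#x}" for start, length in ranges]


# Every other page in a 64 KiB window starting at 1 MiB: bit 12 must be
# clear, bits 13-15 are "don't care", bits 16-31 select the window.
pages = bad_pages(0x00100000, 0xFFFF1000, 0, 0x00200000)
for arg in memmap_args(merge_ranges(pages)):
    print(arg)
```

Because none of the matching pages are adjacent, nothing can be merged,
and the one pattern turns into one memmap= entry per bad page. A
contiguous fault (for example mask 0xFFFFC000) collapses to a single
entry, which is the case memmap= already handles well.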

If they are common, we should probably introduce this as another linked-list data
structure; we can allow it to be preprocessed from the command line if
need be.

I have to say I agree with Google's point that truncating the list is
unacceptable: that would mean running in a known-bad configuration,
and even a hard crash would be better.
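
As a rough illustration of the two points above (a list preprocessed
from the command line, and failing outright rather than truncating),
here is a user-space sketch. Everything in it is hypothetical: the
comma-separated "addr,mask[,addr,mask...]" option format is assumed for
illustration, and BadRange, parse_badram and to_fixed_table are made-up
names, not functions from the patch set.

```python
class BadRange:
    """One node of a singly linked list of bad-memory patterns."""

    def __init__(self, addr, mask, next_node=None):
        self.addr = addr
        self.mask = mask
        self.next = next_node


def parse_badram(option):
    """Parse 'addr,mask[,addr,mask...]' (hex) into a linked list.

    Any malformed input raises ValueError; a partial list is never
    returned.
    """
    fields = option.split(",")
    if len(fields) % 2 != 0:
        raise ValueError("badram: every address needs a mask")
    values = [int(field, 16) for field in fields]  # ValueError on garbage
    head = None
    # Build back to front so the list keeps command-line order.
    for i in range(len(values) - 2, -1, -2):
        head = BadRange(values[i], values[i + 1], head)
    return head


def to_fixed_table(head, capacity):
    """Copy the list into a table of at most `capacity` entries.

    If the list does not fit, raise instead of dropping the tail:
    a truncated table would leave known-bad pages in use.
    """
    table = []
    node = head
    while node is not None:
        if len(table) == capacity:
            raise OverflowError("badram: too many patterns, refusing to truncate")
        table.append((node.addr, node.mask))
        node = node.next
    return table
```

The linked list itself has no fixed size; the capacity check only
matters where the list is copied into something that does, and that is
where the choice between truncating and failing has to be made.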

	-hpa

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .