Re: [PATCH uq/master 2/2] MCE, unpoison memory address across reboot

On 2011-01-17 03:08, Huang Ying wrote:
>>>> As indicated, I'm sitting on lots of fixes and refactorings of the MCE
>>>> user space code. How do you test your patches? Any suggestions how to do
>>>> this efficiently would be warmly welcome.
>>>
>>> We use a self-made test script to test.  Repository is at:
>>>
>>> git://git.kernel.org/pub/scm/utils/cpu/mce/mce-test.git
>>>
>>> The kvm test script is in kvm sub-directory.
>>>
>>> The qemu patch attached is needed by the test script.
>>>
>>
>> Yeah, I already found this yesterday and started reading. I was just
>> searching for p2v in qemu, but now it's clear where it comes from. Will
>> have a look (if you want to preview my changes:
>> git://git.kiszka.org/qemu-kvm.git queues/kvm-upstream).
>>
>> I was almost about to use MADV_HWPOISON instead of the injection module.
>> Is there a way to recover from the fake corruption afterward? I think
>> that would allow moving some of the test logic into qemu and avoiding
>> p2v, which - IIRC - was disliked upstream.
> 
> I don't know how to fully recover from MADV_HWPOISON.  You can recover
> the virtual address space via qemu_ram_remap() introduced in 1/2 of this
> patchset.  But you will lose one or several physical pages for each
> test run.  I think that may not be a big issue for a test machine.
> 
> Ccing Andi and Fengguang, they know more than me about MADV_HWPOISON.

"page-types -b hwpoison -x" does the trick of unpoisoning for me. The
source is at linux/Documentation/vm/page-types.c. So, IMO, it's quite
easy to set up and clean up a test case based on MADV_HWPOISON. I'm not
sure, though, whether that can simulate everything you currently do via
mce-inject.
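
For reference, a minimal sketch of what such an unpoison step could look
like when done by hand. It is not taken from page-types.c. It assumes the
debugfs file /sys/kernel/debug/hwpoison/unpoison-pfn (provided by the
hwpoison-inject module, debugfs mounted) and the /proc/<pid>/pagemap entry
layout (bit 63 = present, bits 0-54 = PFN). All function names are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed /proc/<pid>/pagemap entry layout: bit 63 = present, bits 0-54 = PFN. */
#define PAGEMAP_PRESENT  (1ULL << 63)
#define PAGEMAP_PFN_MASK ((1ULL << 55) - 1)

/* Assumed debugfs path; needs a mounted debugfs and hwpoison-inject loaded. */
#define UNPOISON_PFN_FILE "/sys/kernel/debug/hwpoison/unpoison-pfn"

/* Byte offset of the 64-bit pagemap entry that describes vaddr. */
uint64_t pagemap_offset(uint64_t vaddr, uint64_t page_size)
{
    assert(page_size != 0);
    return (vaddr / page_size) * sizeof(uint64_t);
}

/* Store the PFN from a pagemap entry; returns -1 if the page is not present. */
int pagemap_entry_to_pfn(uint64_t entry, uint64_t *pfn)
{
    if (!(entry & PAGEMAP_PRESENT))
        return -1;
    *pfn = entry & PAGEMAP_PFN_MASK;
    return 0;
}

/* Ask the kernel to unpoison one page frame (needs root); 0 on success. */
int unpoison_pfn(uint64_t pfn)
{
    FILE *f = fopen(UNPOISON_PFN_FILE, "w");
    int ok;

    if (!f)
        return -1;
    ok = fprintf(f, "0x%llx\n", (unsigned long long)pfn) > 0;
    /* fclose() flushes the buffer, so a rejected write shows up here. */
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

As far as I understand, the PFN has to be looked up before the page is
poisoned, because the mapping is torn down afterwards and the pagemap
entry no longer reports the page as present.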

> 
>> Also, is there a way to simulate corrected errors (BUS_MCEERR_AO)?
> 
> BUS_MCEERR_AO is a recoverable uncorrected error, not a corrected
> error.
> 
> The test script covers BUS_MCEERR_AO and BUS_MCEERR_AR.  To see the
> effect of pure BUS_MCEERR_AO, just remove the memory access loop
> (memset) in tools/simple_process/simple_process.c.

Yeah, that question came from my lack of knowledge about the different
error types. Meanwhile, I was able to trigger BUS_MCEERR_AO via
MADV_HWPOISON - and also BUS_MCEERR_AR by accessing that page. However,
I have not succeeded with mce-inject so far, and thus not with mce-test
either. But I need to check this again.
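
A rough sketch of the user-space side of such a test, in case it helps
others reproduce it. It is not the code I used. The helper names
(classify_sigbus, install_sigbus_handler, poison_page) are made up for
illustration. Only madvise(MADV_HWPOISON) and the SIGBUS si_code values
BUS_MCEERR_AO / BUS_MCEERR_AR are real interfaces. Poisoning needs
CAP_SYS_ADMIN and a kernel built with CONFIG_MEMORY_FAILURE.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

enum mce_kind { MCE_NONE, MCE_ACTION_OPTIONAL, MCE_ACTION_REQUIRED };

/* si_code of the last SIGBUS seen by the handler below. */
static volatile sig_atomic_t last_si_code;

/* Map a SIGBUS si_code to the two memory-failure flavours. */
enum mce_kind classify_sigbus(int si_code)
{
    switch (si_code) {
    case BUS_MCEERR_AO: return MCE_ACTION_OPTIONAL; /* page poisoned, not yet touched */
    case BUS_MCEERR_AR: return MCE_ACTION_REQUIRED; /* poisoned page was accessed */
    default:            return MCE_NONE;
    }
}

static void on_sigbus(int sig, siginfo_t *info, void *ctx)
{
    (void)sig;
    (void)ctx;
    last_si_code = info->si_code;
    /* For AR, returning here re-executes the faulting access; a real
     * test would siglongjmp() out of the handler instead. */
}

/* Return the si_code recorded by the handler (0 if none yet). */
int last_sigbus_code(void)
{
    return (int)last_si_code;
}

/* Install the SA_SIGINFO handler; returns 0 on success. */
int install_sigbus_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = on_sigbus;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGBUS, &sa, NULL);
}

/* Inject a fake memory error into the page-aligned address `page`. */
int poison_page(void *page)
{
    long page_size = sysconf(_SC_PAGESIZE);

    assert(page_size > 0);
    return madvise(page, (size_t)page_size, MADV_HWPOISON);
}
```

The intended use is: mmap() an anonymous page, touch it so it is backed
by a physical page, call install_sigbus_handler(), then poison_page().
As far as I know, the AO signal is only delivered if the process has
opted into early kill (prctl(PR_MCE_KILL, ...) or the
vm.memory_failure_early_kill sysctl). Reading the page afterwards should
then raise the AR variant.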

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux