Hi,
Could you please give me any hints about this issue and patch?
On 8/4/22 14:59, Eiichi Tsukata wrote:
Hi
We’ve also hit this case.
On May 5, 2022, at 9:32, zhenwei pi <pizhenwei@xxxxxxxxxxxxx> wrote:
Hi, Paolo
I would appreciate it if you could review this patch.
On 4/20/22 14:45, zhenwei pi wrote:
qemu exits during reset with log:
qemu-system-x86_64: Could not remap addr: 1000@22001000
Currently, after an MCE on guest RAM, qemu records only a ram_addr and
remaps that address with a fixed size (TARGET_PAGE_SIZE) during reset.
In the hugetlbfs scenario, mmap(addr...) needs an address aligned to the
backing page size and a matching size. An unaligned address causes mmap to fail.
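
For illustration only, the alignment described above can be sketched as
follows. This is not the code from the patch: struct poison_range and
align_poison_range() are hypothetical names, and the backing page size is
assumed to be passed in by the caller and to be a power of two.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical record of a poisoned range; not QEMU's actual structure. */
struct poison_range {
    uint64_t start;   /* offset rounded down to the backing page boundary */
    uint64_t length;  /* one whole backing page */
};

/*
 * Round the recorded address down to the start of its backing page and
 * cover the whole page, so that a later remap gets an aligned address
 * and a size that matches the page size.
 */
static struct poison_range align_poison_range(uint64_t ram_addr,
                                              uint64_t page_size)
{
    struct poison_range r;

    /* The mask trick below only works for a non-zero power of two. */
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

    r.start = ram_addr & ~(page_size - 1);
    r.length = page_size;
    return r;
}
```

Assuming a 2 MiB huge page (the page size is not stated in this thread),
the address 22001000 from the log above would become start 22000000 with
length 200000 (hex).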
As far as I checked, a SIGBUS sent from memory_failure() due to PR_MCE_KILL_EARLY has an aligned address
in siginfo, but a SIGBUS sent from kvm_mmu_page_fault() has an unaligned address. This happens only when the guest touches
poisoned pages before they get remapped. This is not the usual case, but it can sometimes happen.
FYI: call path
CPU 1/KVM-328915 [005] d..1. 711765.805910: signal_generate: sig=7 errno=0 code=4 comm=CPU 1/KVM pid=328915 grp=0 res=0
CPU 1/KVM-328915 [005] d..1. 711765.805915: <stack trace>
=> trace_event_raw_event_signal_generate
=> __send_signal
=> do_send_sig_info
=> send_sig_mceerr
=> handle_abnormal_pfn
=> direct_page_fault
=> kvm_mmu_page_fault
=> kvm_arch_vcpu_ioctl_run
=> kvm_vcpu_ioctl
=> __x64_sys_ioctl
=> do_syscall_64
In addition, aligning the length suppresses the following madvise error message in qemu_ram_setup_dump():
qemu_madvise: Invalid argument
madvise doesn't support MADV_DONTDUMP, but dump_guest_core=off specified
Thanks
Eiichi
--
zhenwei pi