Re: [PATCH v2 3/7] accel/kvm: Report the loss of a large memory page

On 11/12/24 12:13, David Hildenbrand wrote:
On 07.11.24 11:21, William Roche wrote:
From: William Roche <william.roche@xxxxxxxxxx>

When an entire large page is impacted by an error (hugetlbfs case),
report better the size and location of this large memory hole, so
give a warning message when this page is first hit:
Memory error: Loosing a large page (size: X) at QEMU addr Y and GUEST addr Z


Hm, I wonder if we really want to special-case hugetlb here.

Why not make the warning independent of the underlying page size?

We already have a warning provided by QEMU (in kvm_arch_on_sigbus_vcpu()):

Guest MCE Memory Error at QEMU addr Y and GUEST addr Z of type BUS_MCEERR_AR/_AO injected

The one I suggest is an additional message, emitted before the one above.

Here is an example:
qemu-system-x86_64: warning: Memory error: Loosing a large page (size: 2097152) at QEMU addr 0x7fdd7d400000 and GUEST addr 0x11600000
qemu-system-x86_64: warning: Guest MCE Memory Error at QEMU addr 0x7fdd7d400000 and GUEST addr 0x11600000 of type BUS_MCEERR_AO injected
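
To illustrate where the size and the two addresses in the first message could come from, here is a minimal standalone sketch. It is not the code from the patch: page_align_down() and format_large_page_warning() are hypothetical helpers, and the sketch assumes the backing page size is a power of two and that the base page size is passed in by the caller.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Round addr down to the start of the page that contains it.
 * page_size must be a power of two (e.g. 4096 or 2097152). */
static uint64_t page_align_down(uint64_t addr, uint64_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return addr & ~(page_size - 1);
}

/* Hypothetical helper, not QEMU code: format the extra warning only when
 * the backing page is larger than the base page size. Returns the number
 * of characters snprintf() would write, or 0 when no large page message
 * applies. */
static int format_large_page_warning(char *buf, size_t len,
                                     uint64_t host_addr, uint64_t guest_addr,
                                     uint64_t page_size,
                                     uint64_t base_page_size)
{
    if (page_size <= base_page_size) {
        return 0;
    }
    return snprintf(buf, len,
                    "Memory error: Loosing a large page (size: %" PRIu64 ") "
                    "at QEMU addr 0x%" PRIx64 " and GUEST addr 0x%" PRIx64,
                    page_size,
                    page_align_down(host_addr, page_size),
                    page_align_down(guest_addr, page_size));
}
```

With a 2097152-byte page, any host address inside the page starting at 0x7fdd7d400000 is rounded down to that start address, so every error hit in the range would report the same large page location.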


In my view, this additional message for the large page case will help explain the sudden proliferation of memory errors that QEMU is likely to report on the impacted range. Not only will the machine administrator see more easily that a single memory error had this large impact, but the message can also help us measure the impact of fixing large page memory error support in the field (in the future).

These are some of the reasons why I think this large page specific message can be useful.



