Re: [PATCH v2 3/7] accel/kvm: Report the loss of a large memory page

David Hildenbrand <david@xxxxxxxxxx> · Tue, 12 Nov 2024 23:22:20 +0100

On 12.11.24 19:17, William Roche wrote:
On 11/12/24 12:13, David Hildenbrand wrote:
On 07.11.24 11:21, William Roche wrote:
From: William Roche <william.roche@xxxxxxxxxx>

When an entire large page is impacted by an error (hugetlbfs case),
report the size and location of this large memory hole more clearly
by giving a warning message when this page is first hit:
Memory error: Loosing a large page (size: X) at QEMU addr Y and GUEST
addr Z

Hm, I wonder if we really want to special-case hugetlb here.

Why not make the warning independent of the underlying page size?

We already have a warning provided by QEMU (in kvm_arch_on_sigbus_vcpu()):

Guest MCE Memory Error at QEMU addr Y and GUEST addr Z of type
BUS_MCEERR_AR/_AO injected

The one I suggest is an additional message provided before the above
message.

Here is an example:
qemu-system-x86_64: warning: Memory error: Loosing a large page (size:
2097152) at QEMU addr 0x7fdd7d400000 and GUEST addr 0x11600000
qemu-system-x86_64: warning: Guest MCE Memory Error at QEMU addr
0x7fdd7d400000 and GUEST addr 0x11600000 of type BUS_MCEERR_AO injected

Hm, I think we should definitely be including the size in the existing 
one. That code was written without huge pages in mind.

We should similarly warn in the arm implementation (where I don't see a 
similar message yet).

In my opinion, this additional message for the large page case will help
to better understand the likely sudden proliferation of memory errors
that QEMU can report on the impacted range.
Not only will the machine administrator more easily identify that a
single memory error had this large impact, it will also help us to
better measure the impact of fixing the large page memory error support
in the field (in the future).

What about extending the existing one to something like

warning: Guest MCE Memory Error at QEMU addr $ADDR and GUEST $PADDR of 
type BUS_MCEERR_AO and size $SIZE (large page) injected

With the "large page" hint you can highlight that this is special.
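
To make the suggested format concrete, here is a minimal sketch of how such
a message could be assembled. format_mce_warning() and BASE_PAGE_SIZE are
hypothetical names used only for illustration, not existing QEMU code, and
the 4 KiB base page size is an assumption:

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Assumption: 4 KiB base pages; real code would query the host. */
#define BASE_PAGE_SIZE 4096u

/*
 * Hypothetical helper (not an existing QEMU function): build the extended
 * warning text, adding the "(large page)" hint only when the backing page
 * is larger than a base page. Returns what snprintf() returns.
 */
static int format_mce_warning(char *buf, size_t len,
                              uint64_t haddr, uint64_t paddr,
                              uint64_t page_size, int action_required)
{
    const char *type = action_required ? "BUS_MCEERR_AR" : "BUS_MCEERR_AO";
    const char *hint = page_size > BASE_PAGE_SIZE ? " (large page)" : "";

    return snprintf(buf, len,
                    "Guest MCE Memory Error at QEMU addr 0x%" PRIx64
                    " and GUEST addr 0x%" PRIx64
                    " of type %s and size %" PRIu64 "%s injected",
                    haddr, paddr, type, page_size, hint);
}
```

With the addresses from the example earlier in the thread and a 2 MiB page,
this would produce a single line carrying both the size and the hint.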

On a related note ... I think we have a problem. Assume we got a SIGBUS 
on a huge page (e.g., somewhere in a 1 GiB page).

We will call kvm_mce_inject(cpu, paddr, code) / 
acpi_ghes_record_errors(ACPI_HEST_SRC_ID_SEA, paddr)

But where is the size information? :/ Won't the VM simply assume that 
there was an MCE on a single 4k page starting at paddr?

I'm not sure if we can inject ranges, or if we would have to issue one 
MCE per page ... hm, what's your take on this?
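
Just to illustrate the "one MCE per page" variant, a rough sketch follows.
All names here (inject_per_base_page(), inject_fn, record_injection(),
BASE_PAGE_SIZE) are hypothetical; the callback merely stands in for
whatever would end up calling kvm_mce_inject() or
acpi_ghes_record_errors(), and 4 KiB guest pages are an assumption:

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the guest tracks errors at 4 KiB granularity. */
#define BASE_PAGE_SIZE 4096u

/* Hypothetical callback type standing in for the real injection path. */
typedef void (*inject_fn)(uint64_t paddr, void *opaque);

/*
 * Walk the whole large page containing paddr and report every base page
 * in it. page_size must be a power of two. Returns the number of
 * injections issued.
 */
static uint64_t inject_per_base_page(uint64_t paddr, uint64_t page_size,
                                     inject_fn inject, void *opaque)
{
    assert(page_size >= BASE_PAGE_SIZE);
    assert((page_size & (page_size - 1)) == 0);

    uint64_t start = paddr & ~(page_size - 1); /* align down to the large page */
    uint64_t count = 0;

    for (uint64_t off = 0; off < page_size; off += BASE_PAGE_SIZE) {
        inject(start + off, opaque);
        count++;
    }
    return count;
}

/* Example callback: only records the first and last address it saw. */
struct inject_stats {
    uint64_t first;
    uint64_t last;
    uint64_t calls;
};

static void record_injection(uint64_t paddr, void *opaque)
{
    struct inject_stats *st = opaque;

    if (st->calls == 0) {
        st->first = paddr;
    }
    st->last = paddr;
    st->calls++;
}
```

For a 2 MiB page that is 512 injections; for a 1 GiB page it would be
262144, so whether a guest copes with that many events is an open
question that this sketch does not answer.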

--
Cheers,

David / dhildenb