Re: [PATCH] kvm: Fix memory slot page alignment logic

Alexander Graf <agraf@xxxxxxx> · Mon, 10 Nov 2014 15:47:49 +0100

On 10.11.14 14:55, Igor Mammedov wrote:
> On Mon, 10 Nov 2014 14:16:58 +0100
> Alexander Graf <agraf@xxxxxxx> wrote:
> 
>>
>>
>> On 10.11.14 13:31, Igor Mammedov wrote:
>>> On Fri,  7 Nov 2014 22:18:45 +0100
>>> Alexander Graf <agraf@xxxxxxx> wrote:
>>>
>>>> Memory slots have to be page aligned to get entered into KVM. There
>>>> is existing logic that tries to ensure that we pad memory slots that
>>>> are not page aligned to the biggest region that would still fit in the
>>>> alignment requirements.
>>>>
>>>> Unfortunately, that logic is broken. It tries to calculate the start
>>>> offset based on the region size.
>>>>
>>>> Fix up the logic to do the thing it was intended to do and document it
>>>> properly in the comment above it.
>>>>
>>>> With this patch applied, I can successfully run an e500 guest with more
>>>> than 3GB RAM (at which point RAM starts overlapping subpage memory regions).
>>>>
>>>> Cc: qemu-stable@xxxxxxxxxx
>>>> Signed-off-by: Alexander Graf <agraf@xxxxxxx>
>>>> ---
>>>>  kvm-all.c | 6 ++++--
>>>>  1 file changed, 4 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/kvm-all.c b/kvm-all.c
>>>> index 44a5e72..596e7ce 100644
>>>> --- a/kvm-all.c
>>>> +++ b/kvm-all.c
>>>> @@ -634,8 +634,10 @@ static void kvm_set_phys_mem(MemoryRegionSection *section, bool add)
>>>>      unsigned delta;
>>>>  
>>>>      /* kvm works in page size chunks, but the function may be called
>>>> -       with sub-page size and unaligned start address. */
>>>> -    delta = TARGET_PAGE_ALIGN(size) - size;
>>>> +       with sub-page size and unaligned start address. Pad the start
>>>> +       address to next and truncate size to previous page boundary. */
>>> I'm a bit confused how it works at all.
>>> Lets assume that there is no mapped pages that include start_addr,
>>> then if start_addr were padded to next page, kvm would map it from there
>>> but the rest of QEMU would still use unaligned start_addr for MemoryRegion
>>> that isn't even mapped.
>>
>> Sorry, I don't understand this paragraph. Memory slots in general are
>> accelerations for memory access - for MMIO (RAM is usually aligned), KVM
>> can always exit to QEMU and just do a manual MMIO exit.
>>
>>> It would seem that instead of padding up to the next page, start_addr
>>> should be moved to the start of the page that includes it to make page
>>> with original start_addr available to guest.
>>
>> No, because in that case you would map something as RAM that really
>> isn't RAM.
>>
>> Imagine you have the following memory layout:
>>
>> 0x1000 page size
>>
>> 1) 0x00000 - 0x10000 RAM
>> 2) 0x10000 - 0x10100 MMIO
>> 3) 0x10100 - 0x20000 RAM
>>
>> Then you want to map 1) as memory slot and 4) from 0x11000 onwards as
>> memory slot.
> so every access to RAM 0x10100-0x11000 which is not represented as memory
> slot would cause VMEXIT?

Yes, there's no other way. Otherwise we wouldn't be able to trap on the
exits from 0x10000 - 0x10100. Hardware only gives us page granularity.

Usually this isn't an issue because overlapping MMIO regions are pretty
large chunks of power-of-2 size - if you see any overlapping at all. On
e500 this bites us though, because we end up with small MSI-X windows
inside our address space (which in turn might also be a bug, but that
doesn't mean that the slot mapping logic should be left as broken as it is).

Alex
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html