Re: [PATCH 5/7] memory: Register the RamDiscardManager instance upon guest_memfd creation

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 1/24/2025 11:27 AM, Alexey Kardashevskiy wrote:
> 
> 
> On 21/1/25 00:06, David Hildenbrand wrote:
>> On 10.01.25 06:13, Chenyi Qiang wrote:
>>>
>>>
>>> On 1/9/2025 5:32 PM, Alexey Kardashevskiy wrote:
>>>>
>>>>
>>>> On 9/1/25 16:34, Chenyi Qiang wrote:
>>>>>
>>>>>
>>>>> On 1/8/2025 12:47 PM, Alexey Kardashevskiy wrote:
>>>>>> On 13/12/24 18:08, Chenyi Qiang wrote:
>>>>>>> Introduce the realize()/unrealize() callbacks to initialize/
>>>>>>> uninitialize
>>>>>>> the new guest_memfd_manager object and register/unregister it in the
>>>>>>> target MemoryRegion.
>>>>>>>
>>>>>>> Guest_memfd was initially set to shared until the commit bd3bcf6962
>>>>>>> ("kvm/memory: Make memory type private by default if it has guest
>>>>>>> memfd
>>>>>>> backend"). To align with this change, the default state in
>>>>>>> guest_memfd_manager is set to private. (The bitmap is cleared to 0).
>>>>>>> Additionally, setting the default to private can also reduce the
>>>>>>> overhead of mapping shared pages into IOMMU by VFIO during the
>>>>>>> bootup
>>>>>>> stage.
>>>>>>>
>>>>>>> Signed-off-by: Chenyi Qiang <chenyi.qiang@xxxxxxxxx>
>>>>>>> ---
>>>>>>>     include/sysemu/guest-memfd-manager.h | 27 +++++++++++++++++++
>>>>>>> ++++
>>>>>>> ++++
>>>>>>>     system/guest-memfd-manager.c         | 28 +++++++++++++++++++
>>>>>>> ++++
>>>>>>> ++++-
>>>>>>>     system/physmem.c                     |  7 +++++++
>>>>>>>     3 files changed, 61 insertions(+), 1 deletion(-)
>>>>>>>
>>>>>>> diff --git a/include/sysemu/guest-memfd-manager.h b/include/sysemu/
>>>>>>> guest-memfd-manager.h
>>>>>>> index 9dc4e0346d..d1e7f698e8 100644
>>>>>>> --- a/include/sysemu/guest-memfd-manager.h
>>>>>>> +++ b/include/sysemu/guest-memfd-manager.h
>>>>>>> @@ -42,6 +42,8 @@ struct GuestMemfdManager {
>>>>>>>     struct GuestMemfdManagerClass {
>>>>>>>         ObjectClass parent_class;
>>>>>>>     +    void (*realize)(GuestMemfdManager *gmm, MemoryRegion *mr,
>>>>>>> uint64_t region_size);
>>>>>>> +    void (*unrealize)(GuestMemfdManager *gmm);
>>>>>>>         int (*state_change)(GuestMemfdManager *gmm, uint64_t offset,
>>>>>>> uint64_t size,
>>>>>>>                             bool shared_to_private);
>>>>>>>     };
>>>>>>> @@ -61,4 +63,29 @@ static inline int
>>>>>>> guest_memfd_manager_state_change(GuestMemfdManager *gmm, uint6
>>>>>>>         return 0;
>>>>>>>     }
>>>>>>>     +static inline void
>>>>>>> guest_memfd_manager_realize(GuestMemfdManager
>>>>>>> *gmm,
>>>>>>> +                                              MemoryRegion *mr,
>>>>>>> uint64_t region_size)
>>>>>>> +{
>>>>>>> +    GuestMemfdManagerClass *klass;
>>>>>>> +
>>>>>>> +    g_assert(gmm);
>>>>>>> +    klass = GUEST_MEMFD_MANAGER_GET_CLASS(gmm);
>>>>>>> +
>>>>>>> +    if (klass->realize) {
>>>>>>> +        klass->realize(gmm, mr, region_size);
>>>>>>
>>>>>> Ditch realize() hook and call guest_memfd_manager_realizefn()
>>>>>> directly?
>>>>>> Not clear why these new hooks are needed.
>>>>>
>>>>>>
>>>>>>> +    }
>>>>>>> +}
>>>>>>> +
>>>>>>> +static inline void guest_memfd_manager_unrealize(GuestMemfdManager
>>>>>>> *gmm)
>>>>>>> +{
>>>>>>> +    GuestMemfdManagerClass *klass;
>>>>>>> +
>>>>>>> +    g_assert(gmm);
>>>>>>> +    klass = GUEST_MEMFD_MANAGER_GET_CLASS(gmm);
>>>>>>> +
>>>>>>> +    if (klass->unrealize) {
>>>>>>> +        klass->unrealize(gmm);
>>>>>>> +    }
>>>>>>> +}
>>>>>>
>>>>>> guest_memfd_manager_unrealizefn()?
>>>>>
>>>>> Agree. Adding these wrappers seem unnecessary.
>>>>>
>>>>>>
>>>>>>
>>>>>>> +
>>>>>>>     #endif
>>>>>>> diff --git a/system/guest-memfd-manager.c b/system/guest-memfd-
>>>>>>> manager.c
>>>>>>> index 6601df5f3f..b6a32f0bfb 100644
>>>>>>> --- a/system/guest-memfd-manager.c
>>>>>>> +++ b/system/guest-memfd-manager.c
>>>>>>> @@ -366,6 +366,31 @@ static int
>>>>>>> guest_memfd_state_change(GuestMemfdManager *gmm, uint64_t offset,
>>>>>>>         return ret;
>>>>>>>     }
>>>>>>>     +static void guest_memfd_manager_realizefn(GuestMemfdManager
>>>>>>> *gmm,
>>>>>>> MemoryRegion *mr,
>>>>>>> +                                          uint64_t region_size)
>>>>>>> +{
>>>>>>> +    uint64_t bitmap_size;
>>>>>>> +
>>>>>>> +    gmm->block_size = qemu_real_host_page_size();
>>>>>>> +    bitmap_size = ROUND_UP(region_size, gmm->block_size) / gmm-
>>>>>>>> block_size;
>>>>>>
>>>>>> imho unaligned region_size should be an assert.
>>>>>
>>>>> There's no guarantee the region_size of the MemoryRegion is PAGE_SIZE
>>>>> aligned. So the ROUND_UP() is more appropriate.
>>>>
>>>> It is all about DMA so the smallest you can map is PAGE_SIZE so even if
>>>> you round up here, it is likely going to fail to DMA-map later anyway
>>>> (or not?).
>>>
>>> Checked the handling of VFIO, if the size is less than PAGE_SIZE, it
>>> will just return and won't do DMA-map.
>>>
>>> Here is a different thing. It tries to calculate the bitmap_size. The
>>> bitmap is used to track the private/shared status of the page. So if the
>>> size is less than PAGE_SIZE, we still use the one bit to track this
>>> small-size range.
>>>
>>>>
>>>>
>>>>>>> +
>>>>>>> +    gmm->mr = mr;
>>>>>>> +    gmm->bitmap_size = bitmap_size;
>>>>>>> +    gmm->bitmap = bitmap_new(bitmap_size);
>>>>>>> +
>>>>>>> +    memory_region_set_ram_discard_manager(gmm->mr,
>>>>>>> RAM_DISCARD_MANAGER(gmm));
>>>>>>> +}
>>>>>>
>>>>>> This belongs to 2/7.
>>>>>>
>>>>>>> +
>>>>>>> +static void guest_memfd_manager_unrealizefn(GuestMemfdManager *gmm)
>>>>>>> +{
>>>>>>> +    memory_region_set_ram_discard_manager(gmm->mr, NULL);
>>>>>>> +
>>>>>>> +    g_free(gmm->bitmap);
>>>>>>> +    gmm->bitmap = NULL;
>>>>>>> +    gmm->bitmap_size = 0;
>>>>>>> +    gmm->mr = NULL;
>>>>>>
>>>>>> @gmm is being destroyed here, why bother zeroing?
>>>>>
>>>>> OK, will remove it.
>>>>>
>>>>>>
>>>>>>> +}
>>>>>>> +
>>>>>>
>>>>>> This function belongs to 2/7.
>>>>>
>>>>> Will move both realizefn() and unrealizefn().
>>>>
>>>> Yes.
>>>>
>>>>
>>>>>>
>>>>>>>     static void guest_memfd_manager_init(Object *obj)
>>>>>>>     {
>>>>>>>         GuestMemfdManager *gmm = GUEST_MEMFD_MANAGER(obj);
>>>>>>> @@ -375,7 +400,6 @@ static void guest_memfd_manager_init(Object
>>>>>>> *obj)
>>>>>>>       static void guest_memfd_manager_finalize(Object *obj)
>>>>>>>     {
>>>>>>> -    g_free(GUEST_MEMFD_MANAGER(obj)->bitmap);
>>>>>>>     }
>>>>>>>       static void guest_memfd_manager_class_init(ObjectClass *oc,
>>>>>>> void
>>>>>>> *data)
>>>>>>> @@ -384,6 +408,8 @@ static void
>>>>>>> guest_memfd_manager_class_init(ObjectClass *oc, void *data)
>>>>>>>         RamDiscardManagerClass *rdmc =
>>>>>>> RAM_DISCARD_MANAGER_CLASS(oc);
>>>>>>>           gmmc->state_change = guest_memfd_state_change;
>>>>>>> +    gmmc->realize = guest_memfd_manager_realizefn;
>>>>>>> +    gmmc->unrealize = guest_memfd_manager_unrealizefn;
>>>>>>>           rdmc->get_min_granularity =
>>>>>>> guest_memfd_rdm_get_min_granularity;
>>>>>>>         rdmc->register_listener = guest_memfd_rdm_register_listener;
>>>>>>> diff --git a/system/physmem.c b/system/physmem.c
>>>>>>> index dc1db3a384..532182a6dd 100644
>>>>>>> --- a/system/physmem.c
>>>>>>> +++ b/system/physmem.c
>>>>>>> @@ -53,6 +53,7 @@
>>>>>>>     #include "sysemu/hostmem.h"
>>>>>>>     #include "sysemu/hw_accel.h"
>>>>>>>     #include "sysemu/xen-mapcache.h"
>>>>>>> +#include "sysemu/guest-memfd-manager.h"
>>>>>>>     #include "trace.h"
>>>>>>>       #ifdef CONFIG_FALLOCATE_PUNCH_HOLE
>>>>>>> @@ -1885,6 +1886,9 @@ static void ram_block_add(RAMBlock *new_block,
>>>>>>> Error **errp)
>>>>>>>                 qemu_mutex_unlock_ramlist();
>>>>>>>                 goto out_free;
>>>>>>>             }
>>>>>>> +
>>>>>>> +        GuestMemfdManager *gmm =
>>>>>>> GUEST_MEMFD_MANAGER(object_new(TYPE_GUEST_MEMFD_MANAGER));
>>>>>>> +        guest_memfd_manager_realize(gmm, new_block->mr, new_block-
>>>>>>>> mr->size);
>>>>>>
>>>>>> Wow. Quite invasive.
>>>>>
>>>>> Yeah... It creates a manager object no matter whether the user
>>>>> wants to
>>>>> us    e shared passthru or not. We assume some fields like private/
>>>>> shared
>>>>> bitmap may also be helpful in other scenario for future usage, and
>>>>> if no
>>>>> passthru device, the listener would just return, so it is acceptable.
>>>>
>>>> Explain these other scenarios in the commit log please as otherwise
>>>> making this an interface of HostMemoryBackendMemfd looks way cleaner.
>>>> Thanks,
>>>
>>> Thanks for the suggestion. Until now, I think making this an interface
>>> of HostMemoryBackend is cleaner. The potential future usage for
>>> non-HostMemoryBackend guest_memfd-backed memory region I can think of is
>>> the the TEE I/O for iommufd P2P support? when it tries to initialize RAM
>>> device memory region with the attribute of shared/private. But I think
>>> it would be a long term story and we are not sure what it will be like
>>> in future.
>>
>> As raised in #2, I'm don't think this belongs into HostMemoryBackend.
>> It kind-of belongs to the RAMBlock, but we could have another object
>> (similar to virtio-mem currently managing a single HostMemoryBackend-
>> >RAMBlock) that takes care of that for multiple memory backends.
> 
> The vBIOS thingy confused me and then I confused others :) There are 2
> things:
> 1) an interface or new subclass of HostMemoryBackendClass which we need
> to advertise and implement ability to discard pages;
> 2) RamDiscardManagerClass which is MR/Ramblock and does not really
> belong to HostMemoryBackend (as it is in what was posted ages ago).
> 
> I suggest Chenyi post a new version using the current approach with the
> comments and commitlogs fixed. Makes sense? Thanks,

Sure, thanks Alexey! BTW, I'm going to have a vacation. Will continue to
work on it after I come back :)

> 
> 





[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux