Yan Zhao <yan.y.zhao@xxxxxxxxx> writes:

> Opt for devm_ioremap_wc() over devm_ioremap() when mapping the framebuffer.
>
> Using devm_ioremap() results in the VA being mapped with PAT=UC-, which
> considerably slows down drm_fb_memcpy(). In contrast, devm_ioremap_wc()
> maps the VA with PAT set to WC, leading to better performance on platforms
> where access to UC memory is much slower than WC memory.
>
> Here's the performance data measured in a guest on the physical machine
> "Sapphire Rapids XCC".
> With host KVM honors guest PAT memory types, the effective memory type
> for this framebuffer range is
> - WC when devm_ioremap_wc() is used
> - UC- when devm_ioremap() is used.
>
> The data presented is an average from 10 execution runs.
>
> Cycles: Avg cycles of executed bochs_primary_plane_helper_atomic_update()
>         from VM boot to GDM show up
> Cnt:    Avg cnt of executed bochs_primary_plane_helper_atomic_update()
>         from VM boot to GDM show up
> T:      Avg time of each bochs_primary_plane_helper_atomic_update().
>
>  -------------------------------------------------
>  |            | devm_ioremap() | devm_ioremap_wc() |
>  |------------|----------------|-------------------|
>  | Cycles     | 211.545M       | 0.157M            |
>  |------------|----------------|-------------------|
>  | Cnt        | 142            | 1917              |
>  |------------|----------------|-------------------|
>  | T          | 0.1748s        | 0.0004s           |
>  -------------------------------------------------
>
> Note:
> Following the rebase to [3], the previously reported GDM failure on the
> VGA device [1] can no longer be reproduced, thanks to the memory management
> improvements made in [2]. Despite this, I have proceeded to submit this
> patch because of the noticeable performance improvements it provides.
>
> Reported-by: Vitaly Kuznetsov <vkuznets@xxxxxxxxxx>

FWIW, this patch (alone) resolves the observed issue, thanks!

Tested-by: Vitaly Kuznetsov <vkuznets@xxxxxxxxxx>

I, however, share Paolo's concern around existing VMs which KVM's change
is effectively breaking.
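
For a rough sense of scale, the averages in the quoted table can be turned
into ratios. The snippet below is only an illustrative sketch: the struct,
variable, and function names are hypothetical (they are not part of the
bochs driver), and the numbers are copied as-is from the table. Since T for
the WC case is given to a single significant digit (0.0004s), the time
ratio is necessarily coarse.

```c
#include <stdio.h>

/* Hypothetical container for one column of the quoted table. */
struct bochs_sample {
	double cycles_m;   /* "Cycles" row, in millions */
	unsigned int cnt;  /* "Cnt" row: avg calls from VM boot to GDM */
	double secs;       /* "T" row: avg time per call, in seconds */
};

/* Values copied from the patch description; nothing is measured here. */
const struct bochs_sample uc_minus = { 211.545, 142, 0.1748 }; /* devm_ioremap() */
const struct bochs_sample wc = { 0.157, 1917, 0.0004 };        /* devm_ioremap_wc() */

/* Plain quotient a / b; the caller must pass a non-zero b. */
double ratio(double a, double b)
{
	return a / b;
}

/* Print how the two mappings compare on each row of the table. */
void print_summary(void)
{
	printf("cycles: UC-/WC = %.1f\n", ratio(uc_minus.cycles_m, wc.cycles_m));
	printf("time:   UC-/WC = %.1f\n", ratio(uc_minus.secs, wc.secs));
	printf("calls:  WC/UC- = %.1f\n", ratio(wc.cnt, uc_minus.cnt));
}
```

Calling print_summary() from a small main() prints the three ratios. They
say nothing beyond what the table already reports for this one setup.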

> Closes: https://lore.kernel.org/all/87jzfutmfc.fsf@xxxxxxxxxx/#t
> Cc: Sean Christopherson <seanjc@xxxxxxxxxx>
> Cc: Paolo Bonzini <pbonzini@xxxxxxxxxx>
> Cc: Kevin Tian <kevin.tian@xxxxxxxxx>
> Cc: Thomas Zimmermann <tzimmermann@xxxxxxx>
> Signed-off-by: Yan Zhao <yan.y.zhao@xxxxxxxxx>
> Link: https://lore.kernel.org/all/87jzfutmfc.fsf@xxxxxxxxxx/#t [1]
> Link: https://patchwork.freedesktop.org/series/138086 [2]
> Link: https://gitlab.freedesktop.org/drm/misc/kernel/-/tree/drm-misc-next [3]
> ---
> v2:
> - Rebased to the latest drm-misc-next branch. [2]
> - Updated patch log to match the base code.
>
> v1: https://lore.kernel.org/all/20240909051529.26776-1-yan.y.zhao@xxxxxxxxx
> ---
>  drivers/gpu/drm/tiny/bochs.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/tiny/bochs.c b/drivers/gpu/drm/tiny/bochs.c
> index 69c5f65e9853..9055b1dd66df 100644
> --- a/drivers/gpu/drm/tiny/bochs.c
> +++ b/drivers/gpu/drm/tiny/bochs.c
> @@ -268,7 +268,7 @@ static int bochs_hw_init(struct bochs_device *bochs)
>  	if (!devm_request_mem_region(&pdev->dev, addr, size, "bochs-drm"))
>  		DRM_WARN("Cannot request framebuffer, boot fb still active?\n");
>
> -	bochs->fb_map = devm_ioremap(&pdev->dev, addr, size);
> +	bochs->fb_map = devm_ioremap_wc(&pdev->dev, addr, size);
>  	if (bochs->fb_map == NULL) {
>  		DRM_ERROR("Cannot map framebuffer\n");
>  		return -ENOMEM;

-- 
Vitaly