RE: [PATCH V5] drm/i915: Disable stolen memory when i915 runs on qemu

"Zhang, Xiong Y" <xiong.y.zhang@xxxxxxxxx> · Thu, 13 Apr 2017 04:15:39 +0000

> + Kevin and David
> 
> On ke, 2017-04-12 at 20:20 +0800, Xiong Zhang wrote:
> > Stolen memory isn't a standard pci resource and exists in RMRR which has
> > identity mapping in iommu table, IGD could access stolen memory in host OS.
> > While according to 'commit c875d2c1b808 ("iommu/vt-d: Exclude devices using
> > RMRRs from IOMMU API domains")', RMRR isn't supported by kvm, then both EPT
> > and guest iommu domain table lack a mapping for stolen memory in kvm IGD
> > passthrough environment. If IGD accesses stolen memory in such environment,
> > many iommu exceptions appear in host dmesg and a gpu hang also occurs.
> > DMAR: [DMA Read] Request device [00:02.0] fault addr da012000
> > [fault reason 05] PTE Write access is not set
> > DMAR: [DMA Read] Request device [00:02.0] fault addr da2df000
> > [fault reason 06] PTE Read access is not set
> >
> > So stolen memory should be disabled in KVM IGD passthrough environment,
> > this patch detects such environment through the existence of qemu
> > emulated isa bridge.
> >
> > When the real ISA bridge is also passed through to guest, guest will have
> > two isa bridges: emulated and real. Qemu guarantees the busnum:devnum.
> > funcnum of emulated isa bridge is always less than the real one. Then
> > emulated isa bridge is always detected first by pci_get_class(ISA). So
> > stolen memory will be disabled in this case also.
> >
> > Stolen memory exists in kernel for a long time, but this patch depends
> > on INTEL_PCH_QEMU_DEVICE_ID_TYPE which was introduced in v4.5 kernel,
> > so this patch should be backported into v4.5 kernel and above.
> >
> > v2:GVT-g may run in non qemu (Zhenyu)
> > v3:Make commit message clear (Daniel)
> > v4:Fix typo
> > v5:Exclude P2X as it is used for VMware (Joonas)
> >
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99028
> >
> > Signed-off-by: Xiong Zhang <xiong.y.zhang@xxxxxxxxx>
> > Reviewed-by: Zhenyu Wang <zhenyuw@xxxxxxxxxxxxxxx>
> > Reviewed-by: Daniel Vetter <daniel.vetter@xxxxxxxx>
> > Cc: stable@xxxxxxxxxxxxxxx
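
The detection described in the commit message above can be modelled in user space. This is only a sketch under stated assumptions, not the patch itself: struct fake_pci_dev, first_isa_bridge() and stolen_memory_should_be_disabled() are hypothetical names, the device array stands in for a bus scan in ascending bus:dev.fn order, and the numeric IDs are the values as I understand them from the kernel headers and should be checked against the tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* IDs as I understand them from the kernel headers (verify before relying on them). */
#define PCI_CLASS_BRIDGE_ISA             0x0601
#define PCI_VENDOR_ID_INTEL              0x8086
#define PCI_SUBVENDOR_ID_REDHAT_QUMRANET 0x1af4
#define PCI_SUBDEVICE_ID_QEMU            0x1100
#define INTEL_PCH_DEVICE_ID_MASK         0xff00
#define INTEL_PCH_QEMU_DEVICE_ID_TYPE    0x2900

/* Hypothetical stand-in for the few struct pci_dev fields used here. */
struct fake_pci_dev {
    uint8_t  bus, dev, fn;
    uint16_t class_code;        /* base class << 8 | subclass */
    uint16_t vendor, device;
    uint16_t subsystem_vendor, subsystem_device;
};

/*
 * Return the first ISA bridge in an array that is assumed to be sorted by
 * bus:dev.fn, mimicking the order in which the commit message says the
 * emulated bridge is found before the real one.
 */
const struct fake_pci_dev *first_isa_bridge(const struct fake_pci_dev *devs,
                                            size_t n)
{
    assert(devs != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (devs[i].class_code == PCI_CLASS_BRIDGE_ISA)
            return &devs[i];
    return NULL;
}

/* True if the bridge looks like the QEMU-emulated one (P2X is not matched). */
bool is_qemu_isa_bridge(const struct fake_pci_dev *d)
{
    return d != NULL &&
           d->vendor == PCI_VENDOR_ID_INTEL &&
           (d->device & INTEL_PCH_DEVICE_ID_MASK) == INTEL_PCH_QEMU_DEVICE_ID_TYPE &&
           d->subsystem_vendor == PCI_SUBVENDOR_ID_REDHAT_QUMRANET &&
           d->subsystem_device == PCI_SUBDEVICE_ID_QEMU;
}

/* Hypothetical policy helper: disable stolen memory when running on QEMU. */
bool stolen_memory_should_be_disabled(const struct fake_pci_dev *devs, size_t n)
{
    return is_qemu_isa_bridge(first_isa_bridge(devs, n));
}
```

Only the first ISA bridge is inspected, so a guest that has both the emulated and the real bridge is still treated as a QEMU guest, provided the emulated one really does come first in scan order.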
> 
> The commit message still fails to address the fact that the Bugzilla
> entry has a completely bogus bisect, the fact that there is a later
> commit that allows RMRRs on graphics devices;
[Zhang, Xiong Y] Indeed, when I boot kernel 3.18, the gpu hang doesn't
happen during the boot process, but IOMMU DMA R/W exceptions on stolen
memory still appear in host dmesg.
When I boot kernel 3.19 and above, I see both the DMA R/W exceptions
in host dmesg and the gpu hang. I lack the knowledge to analyze the
gpu hang error. I have added the error to bugzilla; could you
help check whether this hang is caused by the GT accessing stolen memory
or not?

https://bugs.freedesktop.org/show_bug.cgi?id=99028
and
https://bugs.freedesktop.org/show_bug.cgi?id=99025
are the same issue, which can be fixed by disabling stolen memory.
But the bisect results are different, so I think the first bad commit
reported by git bisect isn't accurate.

> 
> commit 18436afdc11a00ac881990b454cfb2eae81d6003
> Author: David Woodhouse <David.Woodhouse@xxxxxxxxx>
> Date:   Wed Mar 25 15:05:47 2015 +0000
> 
>     iommu/vt-d: Allow RMRR on graphics devices too
> 
[Zhang, Xiong Y] 'commit c875d2c1b808 ("iommu/vt-d: Exclude devices
using RMRRs from IOMMU API domains")' prevents devices associated
with an RMRR from being passed through to a guest.
'commit 18436afdc11a ("iommu/vt-d: Allow RMRR on graphics devices too")'
adds an exception for graphics devices to the above commit, so that
IGD can be assigned (passed through) to a guest.

Hi, David:
   The following message exists in your 18436afdc11a commit message: 
    "Add an exclusion for graphics devices too, so that 'iommu=pt' works
    there. We should be able to successfully assign graphics devices to
    guests too, as long as the initial handling of stolen memory is
    reconfigured appropriately. This has certainly worked in the past."
What does "initial handling of stolen memory is reconfigured
appropriately" mean? We are hitting an issue with the guest IGD accessing
stolen memory.

> And the fact that GuC status is still not answered even I explicitly
> asked for it.
[Zhang, Xiong Y] GuC accesses to stolen memory bypass VT-d; Kevin has
confirmed this with VPG.

I added the i915.enable_guc_loading=1 and i915.enable_guc_submission=1
options, and the dmesg shows that GuC works when stolen memory is disabled.
[    5.265653] [drm:intel_uc_prepare_fw [i915]] fetch uC fw from i915/skl_guc_ver6_1.bin succeeded, fw ffff9167bc93c1e0
[    5.265668] [drm:intel_uc_prepare_fw [i915]] firmware version 6.1 OK (minimum 6.1)
[    5.265709] [drm:intel_uc_prepare_fw [i915]] uC fw fetch status SUCCESS, obj ffff91675ce88600
[    5.267493] [drm:intel_guc_init_hw [i915]] GuC fw status: path i915/skl_guc_ver6_1.bin, fetch SUCCESS, load NONE
[    5.267508] [drm:intel_guc_init_hw [i915]] GuC fw status: fetch SUCCESS, load PENDING
[    5.271571] [drm:guc_ucode_xfer_dma [i915]] DMA status 0x10, GuC status 0x8002f0ec
[    5.271586] [drm:guc_ucode_xfer_dma [i915]] returning 0
[    5.271587] [drm] GuC submission enabled (firmware i915/skl_guc_ver6_1.bin [version 6.1])
[    5.272232] [drm:i915_guc_submission_enable [i915]] reserved cacheline 0x0, next 0x40, linesize 64
[    5.272248] [drm:i915_guc_submission_enable [i915]] Host engines 0x17 => GuC engines used 0xf
[    5.272263] [drm:__reserve_doorbell [i915]] client 0 (high prio=no) reserved doorbell: 0
[    5.274352] [drm:i915_guc_submission_enable [i915]] new priority 2 client ffff91675cc56d80 for engine(s) 0x17: stage_id 0
[    5.274389] [drm:i915_guc_submission_enable [i915]] doorbell id 0, cacheline offset 0x0

> By my limited understanding of VT-d details: The stolen memory is never
> directly accessed by i915 driver (because CPU access doesn't work even
> in DOM0). It is only used through the aperture, which just requires for
> the GT device to have access to the RMRR. Further, the GT device needs
> to have access to stolen memory, because that's what GuC uses for
> backing storage for WOPCM.
> 
> And even if after all of the above is addressed, shouldn't we rather
> try to detect the lack of RMRR, than presence of QEMU ISA?
[Zhang, Xiong Y] Good idea. A device knows that it needs an RMRR, but on a
native machine an RMRR needs bios support, which allocates and reserves the
memory range for it. So in an emulated environment an RMRR needs the
hypervisor's help, and only the hypervisor knows whether it supports RMRR or
not. In order to detect the lack of RMRR in a guest, the i915 driver needs to
detect the hypervisor. So in my last mail I tried to use cpuid(40000001) to
detect the hypervisor. Zhenyu thinks it is unacceptable to use cpuid(40000001)
in a UPT (universal pass through) driver.
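
For reference, the kind of CPUID-based hypervisor check being discussed can be sketched as below. This is an illustration under assumptions, not code from any version of the patch: the helper names are hypothetical, and it looks at the hypervisor-present bit (leaf 1, ECX bit 31) and the vendor signature returned in EBX/ECX/EDX of leaf 0x40000000, rather than at leaf 40000001, which is the leaf mentioned above. The register values are passed in as plain integers so the decoding can be exercised without executing CPUID.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* CPUID leaf 1, ECX bit 31: set when running under a hypervisor. */
#define CPUID_1_ECX_HYPERVISOR (UINT32_C(1) << 31)

/* Hypothetical helper: test the hypervisor-present bit of leaf 1 ECX. */
bool cpuid_hypervisor_present(uint32_t leaf1_ecx)
{
    return (leaf1_ecx & CPUID_1_ECX_HYPERVISOR) != 0;
}

/*
 * Hypothetical helper: rebuild the 12-byte vendor signature that leaf
 * 0x40000000 returns in EBX, ECX, EDX (least significant byte first).
 * 'out' must have room for 13 bytes; it is always NUL-terminated.
 */
void cpuid_hypervisor_signature(uint32_t ebx, uint32_t ecx, uint32_t edx,
                                char out[13])
{
    const uint32_t regs[3] = { ebx, ecx, edx };

    assert(out != NULL);
    for (size_t r = 0; r < 3; r++)
        for (size_t b = 0; b < 4; b++)
            out[r * 4 + b] = (char)((regs[r] >> (8 * b)) & 0xff);
    out[12] = '\0';
}

/* KVM reports "KVMKVMKVM" followed by three NUL bytes. */
bool is_kvm_signature(const char *sig)
{
    return sig != NULL && strcmp(sig, "KVMKVMKVM") == 0;
}
```

A check like this ties the driver to one particular hypervisor's signature, which is the kind of coupling the objection above is about.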
> 
> What comes to my mind is exporting function like device_has_rmrr() from
> intel-iommu.c and consuming that, if we end up doing this. That way,
> if somebody, some day, goes and writes RMRR pass-through code currently
> missing, it'll start working, just like it should.
[Zhang, Xiong Y] I also wanted to implement RMRR pass-through code at first.
But this solution was rejected in my team's meeting, because the kvm/qemu
community discussed this before and came to a solution:
https://access.redhat.com/sites/default/files/attachments/rmrr-wp1.pdf
> 
> Regards, Joonas
> --
> Joonas Lahtinen
> Open Source Technology Center
> Intel Corporation