Re: Various problems trying to vga-passthrough a Renoir iGPU to a xen/qubes-os hvm

Yann Dirson <ydirson@xxxxxxx> · Tue, 21 Dec 2021 23:12:42 +0100 (CET)

Alex wrote:
> 
> On Sun, Dec 19, 2021 at 11:41 AM Yann Dirson <ydirson@xxxxxxx> wrote:
> >
> > Christian wrote:
> > > Am 19.12.21 um 17:00 schrieb Yann Dirson:
> > > > Alex wrote:
> > > >> Thinking about this more, I think the problem might be related
> > > >> to
> > > >> CPU
> > > >> access to "VRAM".  APUs don't have dedicated VRAM, they use a
> > > >> reserved
> > > >> carve out region at the top of system memory.  For CPU access
> > > >> to
> > > >> this
> > > >> memory, we kmap the physical address of the carve out region
> > > >> of
> > > >> system
> > > >> memory.  You'll need to make sure that region is accessible to
> > > >> the
> > > >> guest.
> > > > So basically, the non-virt flow is is: (video?) BIOS reserves
> > > > memory, marks it
> > > > as reserved in e820, stores the physaddr somewhere, which the
> > > > GPU
> > > > driver gets.
> > > > Since I suppose this includes the framebuffer, this probably
> > > > has to
> > > > occur around
> > > > the moment the driver calls
> > > > drm_aperture_remove_conflicting_pci_framebuffers()
> > > > (which happens before this hw init step), right ?
> > >
> > > Well, that partially correct. The efifb is using the PCIe
> > > resources
> > > to
> > > access the framebuffer and as far as I know we use that one to
> > > kick
> > > it out.
> > >
> > > The stolen memory we get over e820/registers is separate to that.

How is the stolen memory communicated to the driver ?  That host physical
memory probably has to be mapped at the same guest physical address for
the magic to work, right ?

> > >
> > > > ... which brings me to a point that's been puzzling me for some
> > > > time, which is
> > > > that as the hw init fails, the efifb driver is still using the
> > > > framebuffer.
> > >
> > > No, it isn't. You are probably just still seeing the same screen.
> > >
> > > The issue is most likely that while efi was kicked out nobody
> > > re-programmed the display hardware to show something different.
> > >
> > > > Am I right in suspecting that efifb should get stripped of its
> > > > ownership of the
> > > > fb aperture first, and that if I don't get a black screen on
> > > > hw_init failure
> > > > that issue should be the first focus point ?
> > >
> > > You assumption with the black screen is incorrect. Since the
> > > hardware
> > > works independent even if you kick out efi you still have the
> > > same
> > > screen content, you just can't update it anymore.
> >
> > It's not only that the screen keeps its contents, it's that the
> > dom0
> > happily continues updating it.
> 
> If the hypevisor is using efifb, then yes that could be a problem as
> the hypervisor could be writing to the efifb resources which ends up
> writing to the same physical memory.  That applies to any GPU on a
> UEFI system.  You'll need to make sure efifb is not in use in the
> hypervisor.

That remark evokes several things to me.  First one is that every time
I've tried booting with efifb disabled in dom0, there was no visible
improvements in the guest driver - i.i. I really have to dig how vram mapping
is performed and check things are as expected anyway.

The other is that, when dom0 cannot use efifb, entering a luks key is
suddenly less user-friendly.  But in theory I'd think we could overcome
this by letting dom0 use efifb until ready to start the guest, a simple
driver unbind at the right moment should be expected to work, right ?
Going further and allowing the guest to use efifb on its own could
possibly be more tricky (starting with a different state?) but does
not seem to sound completely outlandish either - or does it ?

> 
> Alex
> 
> 
> >
> > > But putting efi asside what Alex pointed out pretty much breaks
> > > your
> > > neck trying to forward the device. You maybe could try to hack
> > > the
> > > driver to use the PCIe BAR for framebuffer access, but that might
> > > be
> > > quite a bit slower.
> > >
> > > Regards,
> > > Christian.
> > >
> > > >
> > > >> Alex
> > > >>
> > > >> On Mon, Dec 13, 2021 at 3:29 PM Alex Deucher
> > > >> <alexdeucher@xxxxxxxxx>
> > > >> wrote:
> > > >>> On Sun, Dec 12, 2021 at 5:19 PM Yann Dirson <ydirson@xxxxxxx>
> > > >>> wrote:
> > > >>>> Alex wrote:
> > > >>>>> On Mon, Dec 6, 2021 at 4:36 PM Yann Dirson
> > > >>>>> <ydirson@xxxxxxx>
> > > >>>>> wrote:
> > > >>>>>> Hi Alex,
> > > >>>>>>
> > > >>>>>>> We have not validated virtualization of our integrated
> > > >>>>>>> GPUs.  I
> > > >>>>>>> don't
> > > >>>>>>> know that it will work at all.  We had done a bit of
> > > >>>>>>> testing but
> > > >>>>>>> ran
> > > >>>>>>> into the same issues with the PSP, but never had a chance
> > > >>>>>>> to
> > > >>>>>>> debug
> > > >>>>>>> further because this feature is not productized.
> > > >>>>>> ...
> > > >>>>>>> You need a functional PSP to get the GPU driver up and
> > > >>>>>>> running.
> > > >>>>>> Ah, thanks for the hint :)
> > > >>>>>>
> > > >>>>>> I guess that if I want to have any chance to get the PSP
> > > >>>>>> working
> > > >>>>>> I'm
> > > >>>>>> going to need more details on it.  A quick search some
> > > >>>>>> time
> > > >>>>>> ago
> > > >>>>>> mostly
> > > >>>>>> brought reverse-engineering work, rather than official AMD
> > > >>>>>> doc.
> > > >>>>>>   Are
> > > >>>>>> there some AMD resources I missed ?
> > > >>>>> The driver code is pretty much it.
> > > >>>> Let's try to shed some more light on how things work, taking
> > > >>>> as
> > > >>>> excuse
> > > >>>> psp_v12_0_ring_create().
> > > >>>>
> > > >>>> First, register access through [RW]REG32_SOC15() is
> > > >>>> implemented
> > > >>>> in
> > > >>>> terms of __[RW]REG32_SOC15_RLC__(), which is basically a
> > > >>>> [RW]REG32(),
> > > >>>> except it has to be more complex in the SR-IOV case.
> > > >>>> Has the RLC anything to do with SR-IOV ?
> > > >>> When running the driver on a SR-IOV virtual function (VF),
> > > >>> some
> > > >>> registers are not available directly via the VF's MMIO
> > > >>> aperture
> > > >>> so
> > > >>> they need to go through the RLC.  For bare metal or
> > > >>> passthrough
> > > >>> this
> > > >>> is not relevant.
> > > >>>
> > > >>>> It accesses registers in the MMIO range of the MP0 IP, and
> > > >>>> the
> > > >>>> "MP0"
> > > >>>> name correlates highly with MMIO accesses in PSP-handling
> > > >>>> code.
> > > >>>> Is "MP0" another name for PSP (and "MP1" for SMU) ?  The MP0
> > > >>>> version
> > > >>> Yes.
> > > >>>
> > > >>>> reported at v11.0.3 by discovery seems to contradict the use
> > > >>>> of
> > > >>>> v12.0
> > > >>>> for RENOIR as set by soc15_set_ip_blocks(), or do I miss
> > > >>>> something ?
> > > >>> Typo in the ip discovery table on renoir.
> > > >>>
> > > >>>> More generally (and mostly out of curiosity while we're at
> > > >>>> it),
> > > >>>> do we
> > > >>>> have a way to match IPs listed at discovery time with the
> > > >>>> ones
> > > >>>> used
> > > >>>> in the driver ?
> > > >>> In general, barring typos, the code is shared at the major
> > > >>> version
> > > >>> level.  The actual code may or may not need changes to handle
> > > >>> minor
> > > >>> revision changes in an IP.  The driver maps the IP versions
> > > >>> from
> > > >>> the
> > > >>> ip discovery table to the code contained in the driver.
> > > >>>
> > > >>>> ---
> > > >>>>
> > > >>>> As for the register names, maybe we could have a short
> > > >>>> explanation of
> > > >>>> how they are structured ?  Eg. mmMP0_SMN_C2PMSG_69: that
> > > >>>> seems
> > > >>>> to
> > > >>>> be
> > > >>>> a MMIO register named "C2PMSG_69" in the "MP0" IP, but I'm
> > > >>>> not
> > > >>>> sure
> > > >>>> of the "SMN" part -- that could refer to the "System
> > > >>>> Management
> > > >>>> Network",
> > > >>>> described in [0] as an internal bus.  Are we accessing this
> > > >>>> register
> > > >>>> through this SMN ?
> > > >>> These registers are just mailboxes for the PSP firmware.  All
> > > >>> of
> > > >>> the
> > > >>> C2PMSG registers functionality is defined by the PSP
> > > >>> firmware.
> > > >>>   They
> > > >>> are basically scratch registers used to communicate between
> > > >>> the
> > > >>> driver
> > > >>> and the PSP firmware.
> > > >>>
> > > >>>>
> > > >>>>>   On APUs, the PSP is shared with
> > > >>>>> the CPU and the rest of the platform.  The GPU driver just
> > > >>>>> interacts
> > > >>>>> with it for a few specific tasks:
> > > >>>>> 1. Loading Trusted Applications (e.g., trusted firmware
> > > >>>>> applications
> > > >>>>> that run on the PSP for specific functionality, e.g., HDCP
> > > >>>>> and
> > > >>>>> content
> > > >>>>> protection, etc.)
> > > >>>>> 2. Validating and loading firmware for other engines on the
> > > >>>>> SoC.
> > > >>>>>   This
> > > >>>>> is required to use those engines.
> > > >>>> Trying to understand in more details how we start the PSP
> > > >>>> up, I
> > > >>>> noticed
> > > >>>> that psp_v12_0 has support for loading a sOS firmware, but
> > > >>>> never
> > > >>>> calls
> > > >>>> init_sos_microcode() - and anyway there is no sos firmware
> > > >>>> for
> > > >>>> renoir
> > > >>>> and green_sardine, which seem to be the only ASICs with this
> > > >>>> PSP
> > > >>>> version.
> > > >>>> Is it something that's just not been completely wired up yet
> > > >>>> ?
> > > >>> On APUs, the PSP is shared with the CPU so the PSP firmware
> > > >>> is
> > > >>> part
> > > >>> of
> > > >>> the sbios image.  The driver doesn't load it.  We only load
> > > >>> it on
> > > >>> dGPUs where the driver is responsible for the chip
> > > >>> initialization.
> > > >>>
> > > >>>> That also rings a bell, that we have nothing about Secure OS
> > > >>>> in
> > > >>>> the doc
> > > >>>> yet (not even the acronym in the glossary).
> > > >>>>
> > > >>>>
> > > >>>>> I'm not too familiar with the PSP's path to memory from the
> > > >>>>> GPU
> > > >>>>> perspective.  IIRC, most memory used by the PSP goes
> > > >>>>> through
> > > >>>>> carve
> > > >>>>> out
> > > >>>>> "vram" on APUs so it should work, but I would double check
> > > >>>>> if
> > > >>>>> there
> > > >>>>> are any system memory allocations that used to interact
> > > >>>>> with
> > > >>>>> the PSP
> > > >>>>> and see if changing them to vram helps.  It does work with
> > > >>>>> the
> > > >>>>> IOMMU
> > > >>>>> enabled on bare metal, so it should work in passthrough as
> > > >>>>> well
> > > >>>>> in
> > > >>>>> theory.
> > > >>>> I can see a single case in the PSP code where GTT is used
> > > >>>> instead
> > > >>>> of
> > > >>>> vram: to create fw_pri_bo when SR-IOV is not used (and there
> > > >>>> has
> > > >>>> to be a reason, since the SR-IOV code path does use vram).
> > > >>>> Changing it to vram does not make a difference, but then the
> > > >>>> only bo that seems to be used at that point is the one for
> > > >>>> the
> > > >>>> psp ring,
> > > >>>> which is allocated in vram, so I'm not too much surprised.
> > > >>>>
> > > >>>> Maybe I should double-check bo_create calls to hunt for more
> > > >>>> ?
> > > >>> We looked into this a bit ourselves and ran into the same
> > > >>> issues.
> > > >>> We'd probably need to debug this with the PSP team to make
> > > >>> further
> > > >>> progress, but this was not productized so neither team had
> > > >>> the
> > > >>> resources to delve further.
> > > >>>
> > > >>> Alex
> > > >>>
> > > >>>>
> > > >>>> [0]
> > > >>>> https://github.com/PSPReverse/psp-docs/blob/master/masterthesis-eichner-psp-2020.pdf
> > >
> > >
>