Hi Thomas,

On Mon, Dec 5, 2022 at 3:11 AM Thomas Zimmermann <tzimmermann@xxxxxxx> wrote:
>
> Hi
>
> On 05.12.22 at 10:32, mb@xxxxxxx wrote:
> > I have an RTX 3070 and a 3090, and I am absolutely sure I am binding
> > vfio-pci to the 3090 and not the 3070.
> >
> > I have bound the driver in two different ways: first by passing the IDs
> > to the module, and alternatively by manipulating the system interface and
> > using the override (this is what I originally had to do when I used two
> > 1080s, so I know it works).
> >
> > While the 3090 doesn't show a console, there's a remnant from rEFInd
> > (and grub previously) there.
> >
> > The assessment Alex made previously, where
> > aperture_remove_conflicting_pci_devices() is removing the driver (EFIFB)
> > instead of the device, seems correct, but it could also be a quirk
> > of how EFIFB is implemented. I recall reading a long time ago that EFIFB
> > is a special device and once it detects changes it would simply give up.
> > There was also no way to attach a device to it again as it depends on
> > being preloaded outside the kernel; once something takes over the buffer,
> > reinitializing is "impossible". I never went deeper to try and
> > understand it.
>
> We recently reworked fbdev's interaction with the aperture helpers. [1]
> All devices should now be removed iff the driver has been bound to it
> (which should be the case here). The patches went into a v6.1-rc.
>
> Could you try the most recent v6.1-rc and report if this fixes the problem?

I just tried the latest one, v6.1-rc8, and I can see all the commits for
the series you mentioned there.
The same freeze behavior happens when I load vfio-pci:

[ 6.525463] VFIO - User Level meta-driver version: 0.3
[ 6.528231] Console: switching to colour dummy device 320x90

--
Carlos

>
> Best regards
> Thomas
>
> [1] https://patchwork.freedesktop.org/series/106040/
>
> >
> >
> > On Mon, Dec 5, 2022, 2:00 AM Thomas Zimmermann <tzimmermann@xxxxxxx> wrote:
> >
> > Hi
> >
> > On 05.12.22 at 01:51, Alex Williamson wrote:
> > > On Sat, 3 Dec 2022 17:12:38 -0700
> > > "mb@xxxxxxx" <mb@xxxxxxx> wrote:
> > >
> > >> Hi,
> > >>
> > >> I hope it is ok to reply to this old thread.
> > >
> > > It is, but the only relic of the thread is the subject. For reference,
> > > the latest version of this posted is here:
> > >
> > > https://lore.kernel.org/all/20220622140134.12763-4-tzimmermann@xxxxxxx/
> > >
> > > Which is committed as:
> > >
> > > d17378062079 ("vfio/pci: Remove console drivers")
> > >
> > >> Unfortunately, I found a
> > >> problem only now after upgrading to 6.0.
> > >>
> > >> My setup has multiple GPUs (2), and I depend on EFIFB to have a
> > >> working console.
> >
> > Which GPUs do you have?
> >
> > >> Pre-patch behavior: when I bind vfio-pci to my secondary GPU, both
> > >> the passthrough and the EFIFB keep working fine.
> > >> Post-patch behavior: when I bind vfio-pci to the secondary GPU,
> > >> the EFIFB disappears from the system, binding the console to the
> > >> "dummy console".
> >
> > The efifb would likely use the first GPU. And vfio-pci should only
> > remove the generic driver from the second device. Are you sure that
> > you're not somehow using the first GPU with vfio-pci?
> >
> > >> Whenever you try to access the terminal, you have the screen stuck in
> > >> whatever was the last buffer content, which gives the impression of
> > >> "freezing," but I can still type.
> > >> Everything else works, including the passthrough.
> > >
> > > This sounds like the call to aperture_remove_conflicting_pci_devices()
> > > is removing the conflicting driver itself rather than removing the
> > > device from the driver. Is it not possible to unbind the GPU from
> > > efifb before binding the GPU to vfio-pci to effectively nullify the
> > > added call?
> > >
> > >> I can only think about a few options:
> > >>
> > >> - Is there a way to have EFIFB show up again? After all it looks like
> > >> the kernel has just abandoned it, but the buffer is still there. I
> > >> can't find a single message about the secondary card and EFIFB in
> > >> dmesg, but there's a message for the primary card and EFIFB.
> > >> - Can we have a boolean controlling the behavior of vfio-pci
> > >> altogether or at least controlling the behavior of vfio-pci for that
> > >> specific ID? I know there's already some option for vfio-pci and VGA
> > >> cards, would it be appropriate to attach this behavior to that option?
> > >
> > > I suppose we could have an opt-out module option on vfio-pci to skip
> > > the above call, but clearly it would be better if things worked by
> > > default. We cannot make full use of GPUs with vfio-pci if they're
> > > still in use by host console drivers. The intention was certainly to
> > > unbind the device from any low level drivers rather than disable use of
> > > a console driver entirely. DRM/GPU folks, is that possibly an
> > > interface we could implement? Thanks,
> >
> > When vfio-pci gives the GPU device to the guest, which driver is
> > bound to it?
> >
> > Best regards
> > Thomas
> >
> > >
> > > Alex
> > >
> >
> > --
> > Thomas Zimmermann
> > Graphics Driver Developer
> > SUSE Software Solutions Germany GmbH
> > Maxfeldstr. 5, 90409 Nürnberg, Germany
> > (HRB 36809, AG Nürnberg)
> > Geschäftsführer: Ivo Totev
> >
>
> --
> Thomas Zimmermann
> Graphics Driver Developer
> SUSE Software Solutions Germany GmbH
> Maxfeldstr. 5, 90409 Nürnberg, Germany
> (HRB 36809, AG Nürnberg)
> Geschäftsführer: Ivo Totev