On Tue, Dec 19, 2023 at 11:15 AM Stefan Hoffmeister <stefan.hoffmeister@xxxxxxxxx> wrote: > > > Hi, > > vmwgfx implements drmPrimeFDToHandle in terms of the TTM resource manager. > > At the same time, the driver advertises > > .driver_features = > DRIVER_MODESET | DRIVER_RENDER | DRIVER_ATOMIC | DRIVER_GEM, > > Given that, a userland application will call drmCloseBufferHandle to > correctly release any previously acquired handle. > > In kernel language, this translates to ioctl > DRM_IOCTL_PRIME_FD_TO_HANDLE and DRM_IOCTL_GEM_CLOSE. > > Problems: > > a) because the application uses drmCloseBufferHandle, and that goes > through GEM (the driver claims "I am GEM"!), vmwgfx will always return > EINVAL, because the driver is _not_ GEM, but TTM on that aspect. > > This will always clobber errno for userspace, which is not at all > helpful for meaningful error handling. > > Additionally, the driver offers no formal API contract which would > allow applications to detect that the driver does not like be talked > to like a GEM driver, although it is _claims_ to be GEM driver. > > b) there is no (documented) means to release the handle acquired from > a call to drmPrimeFDToHandle; this implies that in principle any call > to drmPrimeFDToHandle will leak the handle. > > Questions: > > a) how can errno clobbering by vmwgfx with EINVAL be avoided? > > b) what is the correct way to release the TTM resources allocated > through drmPrimeFDToHandle? > > > For context: the KDE Plasma Desktop kwin DRM integration layer makes > extensive use of drmPrimeFDToHandle. With the way the VMware vmwgfx > driver behaves, I see these options: > > a) correctly check the return code of drmCloseBufferHandle and blindly > log all the EINVAL errors triggered by the VMware driver, putting > blame on the application for mis-managing handles > b) drop any return code from drmCloseBufferHandle onto the floor, > completely disregarding application integrity checking > c) hard-code a check for "vmwgfx" as the driver in use and spam "this > driver owned by VMware / Broadcom is broken, ignore the following > EINVAL and secondary effects" for integrity checks > > I do not like any of these options, to be honest. > > Many thanks for any input Thanks for the report! This is mainly an artifact of vmwgfx predating GEM and really basically all of drm, so the ioctl interface on top of which vmwgfx operates is a bit bonkers. I know there are a number of examples where drm graphics drivers were breaking the ioctl interface, but we've tried avoiding that in vmwgfx and so supporting our old userspace and new userspace, which are incompatible, is incredibly complex and fragile. Our approach was to, in general, deal with those as they come. Our userspace, also in general, deals with surfaces, which aren't GEM's, as you've noticed, but instead are backed by a GEM object, so expects the handle to prime to not be GEM's. As you can imagine it's a bit hard to make sure ioctl's for which userspace has the opposite expectations are working fine (we've done it for some already though, the "dumb" ones being an example). If you'd have an IGT test that tests for this without requiring CRC's then we'd get it fixed quickly. Otherwise making it work is difficult, not because this particular bug is difficult to fix, but because it's difficult to fix it while keeping the xorg drivers from 2012 working. We'll likely just make sure the prime ioctl's return proper GEM objects for clients that advertise certain cap's. Is KWin advertising DRM_CLIENT_CAP_CURSOR_PLANE_HOTSPOT? That'd be a trivial, albeit unexpected, way of making sure the old userspace gets the semantics it expects, while making sure the new clients get proper behavior. z