On Tue, Sep 1, 2020 at 9:13 AM Daniel Vetter <daniel@xxxxxxxx> wrote:
>
> On Tue, Aug 18, 2020 at 04:37:51PM +0200, Thierry Reding wrote:
> > On Fri, Aug 14, 2020 at 07:25:17PM +0200, Daniel Vetter wrote:
> > > On Fri, Aug 14, 2020 at 7:17 PM Daniel Stone <daniel@xxxxxxxxxxxxx> wrote:
> > > >
> > > > Hi,
> > > >
> > > > On Fri, 14 Aug 2020 at 17:22, Thierry Reding <thierry.reding@xxxxxxxxx> wrote:
> > > > > I suspect that the reason why this works in X but not in Wayland is
> > > > > because X passes the right usage flags, whereas Weston may not. But I'll
> > > > > have to investigate more in order to be sure.
> > > >
> > > > Weston allocates its own buffers for displaying the result of
> > > > composition through GBM with USE_SCANOUT, which is definitely correct.
> > > >
> > > > Wayland clients (common to all compositors, in Mesa's
> > > > src/egl/drivers/dri2/platform_wayland.c) allocate with USE_SHARED but
> > > > _not_ USE_SCANOUT, which is correct in that they are guaranteed to be
> > > > shared, but not guaranteed to be scanned out. The expectation is that
> > > > non-scanout-compatible buffers would be rejected by gbm_bo_import if
> > > > not drmModeAddFB2.
> > > >
> > > > One difference between Weston and all other compositors (GNOME Shell,
> > > > KWin, Sway, etc) is that Weston uses KMS planes for composition when
> > > > it can (i.e. when gbm_bo_import from dmabuf + drmModeAddFB2 from
> > > > gbm_bo handle + atomic check succeed), but the other compositors only
> > > > use the GPU. So if you have different assumptions about the layout of
> > > > imported buffers between the GPU and KMS, that would explain a fair
> > > > bit.
> > >
> > > Yeah non-modifiered multi-gpu (of any kind) is pretty much hopeless I
> > > think. I guess the only option is if the tegra mesa driver forces
> > > linear and an extra copy on everything that's USE_SHARED or
> > > USE_SCANOUT.
> >
> > I ended up trying this, but this fails for the X case, unfortunately,
> > because there doesn't seem to be a good synchronization point at which
> > the de-tiling blit could be done. Weston and kmscube end up calling a
> > gallium driver's ->flush_resource() implementation, but that never
> > happens for X and glamor.
> >
> > But after looking into this some more, I don't think that's even the
> > problem that we're facing here. The root of the problem that causes the
> > glxgears crash that Karol was originally reporting is because we end up
> > allocating the glxgears pixmaps using the dri3 loader from Mesa. But the
> > dri3 loader will unconditionally pass both __DRI_IMAGE_USE_SHARE and
> > __DRI_IMAGE_USE_SCANOUT, irrespective of whether the buffer will end up
> > being scanned out directly or whether it will be composited onto the
> > root window.
> >
> > What exactly happens depends on whether I run glxgears in fullscreen
> > mode or windowed mode. In windowed mode, the glxgears buffers will be
> > composited onto the root window, so there's no need for the buffers to
> > be scanout-capable. If I modify the dri3 loader to not pass those flags
> > I can make this work just fine.
> >
> > When I run glxgears in fullscreen mode, the modesetting driver ends up
> > wanting to display the glxgears buffer directly on screen, without
> > compositing it onto the root window. This ends up working if I leave out
> > the _USE_SHARE and _USE_SCANOUT flags, but I notice that the kernel then
> > complains about being unable to create a framebuffer, which in turn is
> > caused by the fact that those buffers are not exported (the Tegra Mesa
> > driver only exports/imports buffers that are meant for scanout, under
> > the assumption that those are the only ones that will ever need to be
> > used by KMS) and therefore Tegra DRM doesn't have a valid handle for
> > them.
> >
> > So I think an ideal solution would probably be for glxgears to somehow
> > pass better usage information when allocating buffers, but I suspect
> > that that's just not possible, or would be way too much work and require
> > additional protocol at the DRI level, so it's not really a good option
> > when all we want to fix is backwards-compatibility with pre-modifiers
> > userspace.
> >
> > Given that glamor also doesn't have any synchronization points, I don't
> > see how I can implement the de-tiling blit reliably. I was wondering if
> > it shouldn't be possible to flush the framebuffer resource (and perform
> > the blit) at presentation time, but I couldn't find a good entry point
> > to do this.
> >
> > One other solution that occurred to me was to reintroduce an old IOCTL
> > that we used to have in the Tegra DRM driver. That IOCTL was meant to
> > attach tiling meta data to an imported buffer and was basically a
> > simplified, driver-specific way of doing framebuffer modifiers. That's
> > a very ugly solution, but it would allow us to be backwards-compatible
> > with pre-modifiers userspace and even use an optimal path for rendering
> > and scanning out. The only prerequisite would be that the driver IOCTL
> > was implemented and that a recent enough Mesa was used to make use of
> > it. I don't like this very much because framebuffer modifiers are a much
> > more generic solution, but all of the other options above are pretty
> > much just as ugly.
> >
> > One other idea that I haven't explored yet is to be a little more clever
> > about the export/import dance that we do for buffers. Currently we
> > export/import at allocation time, and that seems to cause a bit of a
> > problem, like the lack of valid GEM handles for some buffers (such as in
> > the glxgears fullscreen use-case discussed above). I wonder if perhaps
> > deferring the export/import dance until the handles are actually
> > required may be a better way to do this.
> > With such a solution, even if a
> > buffer is allocated for scanout, it won't actually be imported/exported
> > if the client ends up being composited onto the root window. Import and
> > export would be limited to buffers that truly are going to be used for
> > drmModeAddFB2(). I'll give that a shot and see if that gets me closer to
> > my goal.
>
> (back from vacations)
>
> I think right thing to do is *shrug*, please use modifiers. They're meant
> to solve these kind of problems for real, adding more hacks to paper over
> userspace not using modifiers doesn't seem like a good idea.
>
> Wrt dri3, since we do client-side allocations and don't have modifiers, we
> have to pessimistically assume we'll get scanned out. Modifiers and
> relevant protocol is fixing this again, but for tegra where we essentially
> can't get this right that leaves us in a very tough spot.
>
> So yeah I think "use modifiers" is the answer.
> -Daniel

Right. The issue is just that we don't have any X release that fixes it,
and some compositors (mutter) don't do the right thing by default either :/

I will ask around for mutter, but for X we really need to do a release, I
think. I've just heard about regressions that we need to fix first.

> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch
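
P.S. For anyone skimming the thread, the deferred export/import idea quoted
above can be modelled in a few lines. This is only a rough sketch and not
what the Tegra Mesa driver does today: every name in it (sketch_buffer,
sketch_buffer_kms_handle, sketch_export_count) is made up for illustration,
and the "export" is just a counter bump standing in for the real
export/import round trip between the GPU and the display device.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model only: none of these names exist in Mesa or Tegra DRM. */
struct sketch_buffer {
    bool scanout_hint;   /* allocation carried a scanout usage flag */
    bool exported;       /* has the export/import round trip happened? */
    uint32_t kms_handle; /* 0 means "no handle on the display device yet" */
};

/* Counts how often the (expensive) export/import step actually ran. */
unsigned int sketch_export_count = 0;

/* Next fake handle to hand out; real handles come from the kernel. */
uint32_t sketch_next_handle = 1;

/* Allocation only records the hint; it does NOT export/import. */
void sketch_buffer_init(struct sketch_buffer *buf, bool scanout_hint)
{
    assert(buf != NULL);
    buf->scanout_hint = scanout_hint;
    buf->exported = false;
    buf->kms_handle = 0;
}

/*
 * Called only on the path that needs a handle for framebuffer creation.
 * The export/import happens here, once, on first use.
 */
uint32_t sketch_buffer_kms_handle(struct sketch_buffer *buf)
{
    assert(buf != NULL);
    if (!buf->exported) {
        buf->kms_handle = sketch_next_handle++;
        buf->exported = true;
        sketch_export_count++;
    }
    return buf->kms_handle;
}
```

In this model a composited (windowed) client never calls
sketch_buffer_kms_handle(), so its buffers are never exported even though
they were allocated with the scanout hint. A fullscreen client pays for the
export once, on first use. It does nothing for the tiling layout problem,
which is still what modifiers are for.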