On Wed, Aug 12, 2020 at 7:03 PM James Jones <jajones@xxxxxxxxxx> wrote: > > On 8/12/20 5:37 AM, Ilia Mirkin wrote: > > On Wed, Aug 12, 2020 at 8:24 AM Karol Herbst <kherbst@xxxxxxxxxx> wrote: > >> > >> On Wed, Aug 12, 2020 at 12:43 PM Karol Herbst <kherbst@xxxxxxxxxx> wrote: > >>> > >>> On Wed, Aug 12, 2020 at 12:27 PM Karol Herbst <kherbst@xxxxxxxxxx> wrote: > >>>> > >>>> On Wed, Aug 12, 2020 at 2:19 AM James Jones <jajones@xxxxxxxxxx> wrote: > >>>>> > >>>>> Sorry for the slow reply here as well. I've been in the process of > >>>>> rebasing and reworking the userspace patches. I'm not clear my changes > >>>>> will address the Jetson Nano issue, but if you'd like to try them, the > >>>>> latest userspace changes are available here: > >>>>> > >>>>> https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3724 > >>>>> > >>>>> And the tegra-drm kernel patches are here: > >>>>> > >>>>> > >>>>> https://patchwork.ozlabs.org/project/linux-tegra/patch/20191217005205.2573-1-jajones@xxxxxxxxxx/ > >>>>> > >>>>> Those + the kernel changes addressed in this thread are everything I had > >>>>> outstanding. > >>>>> > >>>> > >>>> I don't know if that's caused by your changes or not, but now the > >>>> assert I hit is a different one pointing out that > >>>> nvc0_miptree_select_best_modifier fails in a certain case and returns > >>>> MOD_INVALID... anyway, it seems like with your patches applied it's > >>>> now way easier to debug and figure out what's going wrong, so maybe I > >>>> can figure it out now :) > >>>> > >>> > >>> collected some information which might help to track it down. > >>> > >>> src/gallium/frontends/dri/dri2.c:648 is the assert hit: assert(*zsbuf) > >>> > >>> templ is {reference = {count = 0}, width0 = 300, height0 = 300, depth0 > >>> = 1, array_size = 1, format = PIPE_FORMAT_Z24X8_UNORM, target = > >>> PIPE_TEXTURE_2D, last_level = 0, nr_samples = 0, nr_storage_samples = > >>> 0, usage = 0, bind = 1, flags = 0, next = 0x0, screen = 0x0} > >>> > >>> inside tegra_screen_resource_create modifier says > >>> DRM_FORMAT_MOD_INVALID as template->bind is 1 > >>> > >>> and nvc0_miptree_select_best_modifier returns DRM_FORMAT_MOD_INVALID, > >>> so the call just returns NULL leading to the assert. > >>> > >>> Btw, this is on Xorg-1.20.8-1.fc32.aarch64 with glxgears. > >>> > >> > >> So I digged a bit deeper and here is what tripps it of: > >> > >> when the context gets made current, the normal framebuffer validation > >> and render buffer allocation is done, but we end up inside > >> tegra_screen_resource_create at some point with PIPE_BIND_SCANOUT set > >> in template->bind. Now the tegra driver forces the > >> DRM_FORMAT_MOD_LINEAR modifier and calls into > >> resource_create_with_modifiers. > >> > >> If it wouldn't do that, nouveau would allocate a tiled buffer, with > >> that it's linear and we at some point end up with an assert about a > >> depth_stencil buffer being there even though it shouldn't. If I always > >> use DRM_FORMAT_MOD_INVALID in tegra_screen_resource_create, things > >> just work. > >> > >> That's kind of the cause I pinpointed the issue down to. But I have no > >> idea what's supposed to happen and what the actual bug is. > > > > Yeah, the bug with tegra has always been "trying to render to linear > > color + tiled depth", which the hardware plain doesn't support. (And > > linear depth isn't a thing.) > > > > Question is whether what it's doing necessary. PIPE_BIND_SCANOUT > > (/linear) requirements are needed for DRI2 to work (well, maybe not in > > theory, but at least in practice the nouveau ddx expects linear > > buffers). However tegra operates on a more DRI3-like basis, so with > > "client" allocations, tiled should work OK as long as there's > > something in tegra to copy it to linear when necessary? > > I can confirm the above: Our hardware can't render to linear depth > buffers, nor can it mix linear color buffers with block linear depth > buffers. > > I think there's a misunderstanding on expected behavior of > resource_create_with_modifiers() here too: > tegra_screen_resource_create() is passing DRM_FORMAT_MOD_INVALID as the > only modifier in non-scanout cases. Previously, I believe nouveau may > have treated that as "no modifiers specified. Fall back to internal > layout selection logic", but in my patches I "fixed" it to match other > drivers' behavior, in that allocation will fail if that is the only > modifier in the list, since it is equivalent to passing in a list > containing only unsupported modifiers. To get fallback behavior, > tegra_screen_resource_create() should pass in (NULL, 0) for (modifiers, > count), or just call resource_create() on the underlying screen instead. > > Beyond that, I can only offer my thoughts based on analysis of the code > referenced here so far: > > While I've learned from the origins of this thread applications/things > external to Mesa in general shouldn't be querying format modifiers of > buffers created without format modifiers, tegra is a Mesa internal > component that already has some intimate knowledge of how the nouveau > driver it sits on top of works. Nouveau will always be able to > construct and return a valid format modifier for unorm single sampled > color buffers (and hopefully, anything that can scan out going forward), > both before and after my patches I believe, regardless of how they were > allocated. After my patches, it should even work for things that can't > scan out in theory. Hence, looking at this without knowledge of what > motivated the original changes, it seems like > tegra_screen_resource_create should just naively forward the > resource_create() call, relying on nouveau to select a layout and > provide a valid modifier when queried for import. As Karol notes, this > works fine for at least this simple test case, and it's what nouveau > itself would be doing with an equivalent callstack, excepting the > modifier query, so I find it hard to believe it breaks some application > behavior. It'll also end up being equivalent (in end result, not quite > semantically) to what dri3_alloc_render_buffer() was doing prior to the > patch mentioned that broke things for Karol, so certainly for the DRI3 > usage it's the right behavior. > > Ilia, what in the nouveau DDX (As in Xfree86 DDX?) assumes linear > buffers? It sounds like you don't think it will interact poorly with > this path regardless? Thierry, do you recall what motivated the > force-linear code here? > > As to why this works for Thierry and not Karol, that's confusing. Are > you both using the same X11 DDX (modesetting I assume?) and X server > versions? Could it be a difference in client-side DRI library code somehow? > it's X. 1.20.99.1 works, 1.20.8 is broken. > Thanks, > -James > > > -ilia > > > _______________________________________________ dri-devel mailing list dri-devel@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/dri-devel