On Sat, Nov 13, 2021 at 09:06:57AM -0800, Linus Torvalds wrote: > [ Hmm. This email was marked as spam for me. I see no obvious reason > for it being marked as spam, but it happens.. ] > > On Thu, Nov 11, 2021 at 12:52 PM Sudip Mukherjee > <sudipm.mukherjee@xxxxxxxxx> wrote: > > > > # first bad commit: [cd7f5ca33585918febe5e2f6dc090a21cfa775b0] > > drm/virtio: implement context init: add virtio_gpu_fence_event > > Hmm. Judging from your automated screenshots, the login never appears. Greg also reported a regression, plus I looked at the commit and had questions too. > > And, indeed reverting cd7f5ca33585 on top of debe436e77c7 has fixed > > the problem I was seeing on my qemu test of x86_64. The qemu image is > > based on Ubuntu. > > Presumably either that commit is somehow buggy in itself - or it does > exactly what it means to do, and the new poll() semantics just > confuses the heck out of the X server (or wayland or whatever). > > And honestly, if I read that thing correctly, the patch is entirely > broken. The new poll function (virtio_gpu_poll()) will unconditionally > remove the first event from the event list, and then report "Yeah, I > had events". > > This is completely bogus for a few reasons: > > - poll() really should be idempotent, because the poll function gets > called multiple times > > - it doesn't even seem to check that the event that it removes is the > new VIRTGPU_EVENT_FENCE_SIGNALED_INTERNAL kind of event, so it will > unconditionally just remove random events. > > - it does seem to check the "vfpriv->ring_idx_mask" and do the old > thing if that is zero, but I see absolutely no reason for that (and > that check itself has caused problems, see below) > > Honestly, my reaction to this all is that that commit is fundamentally > broken and probably should be reverted regardless as "this commit does > bad things". > > HOWEVER - it has had a fix for a NULL pointer dereference in the > meantime - can you check whether the current top of tree happens to > work for you? Maybe your problem isn't due to "that commit does > unnatural things", but simply due to the bug fixed in d89c0c8322ec > ("drm/virtio: Fix NULL dereference error in virtio_gpu_poll"). > > And if it's still broken with that commit, I'll happily revert it and > people need to go back to the drawing board. > > In fact, I would really suggest that people look at that > virtio_gpu_poll() function regardless. That odd "let's unconditionally > just drop events in the poll function is really REALLY broken > behavior. Tbh I haven't looked at the poll implementation, but it's fishy on principles as in drm drivers shouldn't reinvent them. The commit message cites vmwgfx as example for a private driver drm_event, but that works perfectly fine with standard drm_poll (and is meant to work perfectly fine with standard drm_poll). So if it's buggy on top of questionable I think revert might be the right choice irrespective of whether there's some fixes in-flight. So if you end up pushing that revert: References: https://lore.kernel.org/dri-devel/YZJrutLaiwozLfSw@phenom.ffwll.local/ Acked-by: Daniel Vetter <daniel.vetter@xxxxxxxx> Plus credit Greg too and all that ofc. But lets wait a bit for virtio folks to chime in, it's only Monday :-) Cheers, Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch _______________________________________________ Virtualization mailing list Virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx https://lists.linuxfoundation.org/mailman/listinfo/virtualization