On 23/03/17 07:32 PM, Thomas Hellstrom wrote:
> On 03/23/2017 11:10 AM, Daniel Vetter wrote:
>> On Thu, Mar 23, 2017 at 09:35:25AM +0100, Thomas Hellstrom wrote:
>>> On 03/23/2017 08:31 AM, Daniel Vetter wrote:
>>>> On Thu, Mar 23, 2017 at 08:28:32AM +0100, Daniel Vetter wrote:
>>>>> On Thu, Mar 23, 2017 at 07:22:31AM +0100, Thomas Hellstrom wrote:
>>>>>> On 03/22/2017 10:50 PM, Daniel Vetter wrote:
>>>>>>> It's been around forever, no one bothered to address the FIXME, so I
>>>>>>> presume it's all fine.
>>>>>>>
>>>>>>> Cc: Sinclair Yeh <syeh@xxxxxxxxxx>
>>>>>>> Cc: Thomas Hellstrom <thellstrom@xxxxxxxxxx>
>>>>>>> Signed-off-by: Daniel Vetter <daniel.vetter@xxxxxxxxx>
>>>>>> NAK. We need to properly address this. Probably as part of the atomic
>>>>>> update.
>>>>> So could someone with vmwgfx understanding explain this? Note that the
>>>>> FIXME was originally added by me years ago, because I wasn't sure (only
>>>>> about 90%) that this is safe, and was essentially pleading for a vmwgfx
>>>>> expert to review this?
>>>>>
>>>>> Since it didn't happen I presume it's not that terribly and probably safe
>>>>> ...
>>>>>
>>>>> I'm still 90% sure that this is correct, but I'd love for a vmwgfx to
>>>>> audit it. Replying with a NAK is kinda not the response I was hoping for
>>>>> (and yes I guess I should have explained what's going on here better, but
>>>>> it's just a git blame of the FIXME comment away).
>>> So the code has been left in place because it works. Altering it now
>>> will create unnecessary merge conflicts with the atomic code, and the
>>> change isn't tested and audited which means we need to drop focus from
>>> what we're doing and audit and test code that isn't going to be used
>>> anyway for not apparent reason? But otoh put in the below context there
>>> indeed is a reason.
>>>
>>> From a quick audit of the existing code it seems like at least
>>> vmw_cursor_update_position is touching global device state so I think at
>>> a minimum we need to take a spinlock in that function. Otherwise it
>>> seems to be safe.
>> Note that you're holding the crtc lock already, which gives you exclusion
>> against concurrent page_flips, mode_sets and property changes. Note also
>> that page_flips themselves also only hold the crtc lock, so you can run
>> multiple page_flips in parallel on different crtc (iirc vmwgfx has
>> multiple crtc, if not this discussion is entirely moot).
>>
>> tbh I'd be surprised if my patch really breaks something that hasn't been
>> a pre-existing issue for a long time. The original commit which added this
>> FIXME comment is from 2012. Note also that because it's a hack, you
>> already have a pretty a real race with the core drm state keeping, and no
>> one seems to have hit that either.
>>
>> I mean I can dig through vmwgfx code and do the audit, but it'll take a
>> few hours and vmwgfx is it's own world, so much harder to understand (for
>> me).
>>
>
> I'm thinking of the situation when someone would call a cursor_set ioctl
> in parallell for two crtcs at the same time and race writing the
> position registers?
> Note that the device has only a single global cursor.
> Admittedly the effects of a race would probably be small, but I'd rather
> see it being properly protected.

Indeed, as long as userspace uses cursor positions (and images) on each
CRTC which are consistent with a single cursor in a single framebuffer,
it shouldn't matter in which order they write the registers. And if the
per-CRTC positions aren't consistent like that, locking won't help either.

Strictly speaking, the (virtual) hardware is too limited to support the
legacy KMS cursor API. AFAIR e.g. weston at least used to make use of HW
cursors for other surfaces, not sure that's currently the case though.
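
To make the two points above concrete, here is a rough userspace sketch. It
is not vmwgfx code. All names (fake_dev, fake_crtc, fake_cursor_update_position,
fake_crtc_cursor_move) are hypothetical, the C11 atomic_flag spinlock only
stands in for a kernel spinlock, and I'm assuming that a CRTC-relative cursor
position is turned into a global one by adding the CRTC's origin in the shared
framebuffer.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical model of a device with ONE global cursor. */
struct fake_dev {
    atomic_flag cursor_lock; /* userspace stand-in for a kernel spinlock */
    int cursor_x;            /* models the global cursor X register */
    int cursor_y;            /* models the global cursor Y register */
};

#define FAKE_DEV_INIT \
    { .cursor_lock = ATOMIC_FLAG_INIT, .cursor_x = 0, .cursor_y = 0 }

/* Hypothetical CRTC: only its origin in the shared framebuffer matters here. */
struct fake_crtc {
    int x;
    int y;
};

static void fake_cursor_lock(struct fake_dev *dev)
{
    /* Spin until the flag was previously clear, i.e. we now own the lock. */
    while (atomic_flag_test_and_set_explicit(&dev->cursor_lock,
                                             memory_order_acquire))
        ;
}

static void fake_cursor_unlock(struct fake_dev *dev)
{
    atomic_flag_clear_explicit(&dev->cursor_lock, memory_order_release);
}

/*
 * Write both position registers as one unit, so two CRTCs racing here can
 * never leave X from one update paired with Y from the other.
 */
static void fake_cursor_update_position(struct fake_dev *dev, int x, int y)
{
    assert(dev != NULL);
    fake_cursor_lock(dev);
    dev->cursor_x = x;
    dev->cursor_y = y;
    fake_cursor_unlock(dev);
}

/* Legacy-style cursor move: CRTC-relative position to framebuffer position. */
static void fake_crtc_cursor_move(struct fake_dev *dev,
                                  const struct fake_crtc *crtc, int x, int y)
{
    assert(crtc != NULL);
    fake_cursor_update_position(dev, crtc->x + x, crtc->y + y);
}
```

With one CRTC at origin (0, 0) and another at (1024, 0), a cursor at
framebuffer position (1100, 50) is (1100, 50) relative to the first and
(76, 50) relative to the second. Both calls store the same global value, so
the order doesn't matter. The lock only keeps the X/Y pair from being torn;
if the two per-CRTC positions disagree, the last writer still wins.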

-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer