On 2019/05/27, Thomas Hellstrom wrote: > On 5/27/19 2:35 PM, Emil Velikov wrote: > > Hi Thomas, > > > > On 2019/05/27, Thomas Hellstrom wrote: > > > > > > I think we might be talking past each other, let's take a step back: > > > > > > > > - as of previous patch, all of vmwgfx ioctls size is consistently > > > > handled by the core > > > I don't think I follow you here, AFAICT patch 3/5 only affects and > > > relaxes the execbuf checking (and in fact a little more than I would > > > like)? > > > > > Precisely, it makes execbuf ioctl behave like all other ioctls - both > > vmwgfx and rest of drm. > > But we're still enforcing a non-relaxed size check for the other vmwgfx > private ioctls, right? Which is relaxed, together with the directions, in > this commit? > Regardless of the patch, all !execbuf vmwgfx ioctls use the related size checking from core drm. > (Not that it matters much to the discussion, though). > Agreed. > > > > > > - handling of featue flags, as always, is responsibility of the > > > > driver > > > > ifself > > > > - with this patch, ioctl direction is also handled by core. > > > > > > > > Here core ensures we only copy in/out as much data as the kernel > > > > implementation can handle. > > > > > > > > > > > > Let's consider the following real world example - msm and virtio_gpu. > > > > > > > > An in field of an _IOW ioctl becomes in/out aka _IORW ioctl. > > > > - we add a flag to annotate/request the out, as always invalid flags > > > > are return -EINVAL > > > > - we change the ioctl encoding > > > > > > > > As currently handled by core DRM, old kernel/new userspace and > > > > vice-versa works just fine. Sadly, vmwgfx will error out, while it > > > > could > > > > be avoided. > > > IMO basically we have a tradeoff between strict checking in this case, > > > and new user-space vs old kernel "hazzle-free" transition in the > > > relaxed case. > > > > > Precisely. If I read the code correctly, ATM new userspace will fail > > against old kernels. Unless userspace writes two versions of the ioctl - > > with with each encoding. > > > > > > As said above, I'll gladly adjust core and/or others, if this relaxed > > > > approach causes an issue somewhere. A specific use-case, real or > > > > hypothetical will be appreciated. > > > To me there are two important reasons to keep the strict approach. > > > > > > 1) Avoid user-space mistakes early in the development cycle. We can't > > > distinguish between buggy user-space and "new" user-space. This is > > > important because of [a]) below. > > > > > Can you provide a concrete example, please? > > OK, let's say you were developing fence wait functionality. Like > vmw_fence_obj_wait ioctl. Then suddenly you started to wonder why the wait > never timed out as it should. The reason turn out to be that signals were > restarting the waits with the original timeout. So you change the ioctl from > W to RW and add a kernel-computed time to the argument. Everything is fine, > except that you forget to change this in a user-space application somewhere. > > So now what happens, is that that user-space bug can live on undetected as > in 1), and that means you can never go back and implement a stricter check > because that would completely break old user-space. > If I understand you correctly, the W -> RW change in unnecessary. Yet the only negative effect that I can see is the copy_to_user() overhead. The copy should be negligible, yet it "feels" silly. Is there anything more serious that I've missed? Having a closer look - vmwgfx (et al) seems to stand out, such that it does not provide a UABI define including the encoding. Hence it sort of duplicates that in userspace, by using the explicit drmCommand* Guess I could follow the suggestion in vmwgfx_drv.c move the defines to UABI, sync header and update mesa/xf86-video-vmwgfx. What do you think - yes, or please don't? > The current code will trap (and has historically trapped) code like this. > That's mainly why I'm reluctant to give it up, but I guess it can be > conditionally compiled in for debug purposes. > This piece here, is the holly grail. I'll go further and suggest: - add a strict encoding and size check, behind a config toggle - make it a core drm thing and drop the custom vmwgfx one Will keep it disabled by default - but will clearly document Kconfig and docs that devs should toggle it to catch bugs. > > > > > 2) Catch a lot of fuzzer combinations and error out early instead of > > > forwarding them to the ioctl function where they may cause harm. > > > > > Struggling to see why this is a problem? At some point the fuzzer will > > get past this first line of defence, so we want to make the rest of the > > ioctl is robust. > > > > > > > I think the new user-space vs old kernel can be handled nicely in user- > > > space with feature flags or API versions. That's the way we've handled > > > them up to now? > > > > > How is a feature flag doing to help if the encoding changes from _IOW > > to _IORW? > > Ah, you're referring to old user-space new kernel? Yes, I was probably > reading a bit too fast. Sorry about that. > > So we're basically landing in a tradeoff between trapping problems like > above, and hazzle-free ioctl argument definition change. > > OK, so I'm ok with that as long as there is a way we can compile in strict > checking, which will likely has to be as a vmwgfx-specific wrapper. > Ack, I'll proceed with the debug toggle suggestion. Thank you for the insightful input. Emil _______________________________________________ dri-devel mailing list dri-devel@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/dri-devel