Hi On Wed, Mar 23, 2016 at 12:56 PM, Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> wrote: > On Wed, Mar 23, 2016 at 12:30:42PM +0100, David Herrmann wrote: >> My question was rather about why we do this? Semantics for EINTR are >> well defined, and with SA_RESTART (default on linux) user-space can >> ignore it. However, looping on EAGAIN is very uncommon, and it is not >> at all clear why it is needed? >> >> Returning an error to user-space makes sense if user-space has a >> reason to react to it. I fail to see how EAGAIN on a cache-flush/sync >> operation helps user-space at all? As someone without insight into the >> driver implementation, it is hard to tell why.. Any hints? > > The reason we return EAGAIN is to workaround a deadlock we face when > blocking on the GPU holding the struct_mutex (inside the client's > process), but the GPU is dead. As our locking is very, very coarse we > cannot restart the GPU without acquiring the struct_mutex being held by > the client so we wake the client up and tell them the resource they are > waiting on (the flush of the object from the GPU into the CPU domain) is > temporarily unavailable. If they try to immediately wait upon the ioctl > again, they are blocked waiting for the reset to occur before they may > complete their flush. There are a few other possible deadlocks that are > also avoided with EAGAIN (again, the issue is more or less the lack of > fine grained locking). ...so you hijacked EAGAIN for all DRM ioctls just for a driver workaround? EAGAIN is universally used to signal the caller about a blocking resource. It is very much linked to O_NONBLOCK. Why not use EBUSY, ECANCELED, ECOMM, EDEADLOCK, EIO, EL3RST, ... Anyhow, I guess that ship has sailed. But just mentioning EAGAIN in a kernel-doc is way to vague for user-space to figure out they should loop on it. Thanks David -- To unsubscribe from this list: send the line "unsubscribe linux-media" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html