On Wed, Mar 23, 2016 at 04:32:59PM +0100, David Herrmann wrote: > Hi > > On Wed, Mar 23, 2016 at 12:56 PM, Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> wrote: > > On Wed, Mar 23, 2016 at 12:30:42PM +0100, David Herrmann wrote: > >> My question was rather about why we do this? Semantics for EINTR are > >> well defined, and with SA_RESTART (default on linux) user-space can > >> ignore it. However, looping on EAGAIN is very uncommon, and it is not > >> at all clear why it is needed? > >> > >> Returning an error to user-space makes sense if user-space has a > >> reason to react to it. I fail to see how EAGAIN on a cache-flush/sync > >> operation helps user-space at all? As someone without insight into the > >> driver implementation, it is hard to tell why.. Any hints? > > > > The reason we return EAGAIN is to workaround a deadlock we face when > > blocking on the GPU holding the struct_mutex (inside the client's > > process), but the GPU is dead. As our locking is very, very coarse we > > cannot restart the GPU without acquiring the struct_mutex being held by > > the client so we wake the client up and tell them the resource they are > > waiting on (the flush of the object from the GPU into the CPU domain) is > > temporarily unavailable. If they try to immediately wait upon the ioctl > > again, they are blocked waiting for the reset to occur before they may > > complete their flush. There are a few other possible deadlocks that are > > also avoided with EAGAIN (again, the issue is more or less the lack of > > fine grained locking). > > ...so you hijacked EAGAIN for all DRM ioctls just for a driver > workaround? No, we utilized the fact that EAGAIN was already enshrined by libdrm as the defacto mechanism for repeating the ioctl in order to repeat the ioctl for a driver workaround. -Chris -- Chris Wilson, Intel Open Source Technology Centre _______________________________________________ devel mailing list devel@xxxxxxxxxxxxxxxxxxxxxx http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel