On 23/01/17 15:13, Johan Hovold wrote: > On Mon, Jan 23, 2017 at 02:32:48PM +0000, Bryan O'Donoghue wrote: >> On 23/01/17 12:04, Johan Hovold wrote: >> >>> +static void gb_operation_timeout(unsigned long arg) >>> +{ >>> + struct gb_operation *operation = (void *)arg; >>> + >>> + if (gb_operation_result_set(operation, -ETIMEDOUT)) { >>> + /* >>> + * A stuck request message will be cancelled from the >>> + * workqueue. >>> + */ >>> + queue_work(gb_operation_completion_wq, &operation->work); >>> + } >>> +} >>> + >> >> If queue_work fails, you will never wake up the waiter. > > How could queue_work fail here? The operation result is always set > exactly once before being queued so this is fine. Perhaps it relates to a bug elsewhere, though I struggle to see how we could feasibly re-use the work associated with an in-flight gb_message. Could you trap the result code and BUG_ON()/WARN_ON() for it so that we can debug this a little bit further ? > >>> - >>> - ret = wait_for_completion_interruptible_timeout(&operation->completion, >>> - timeout_jiffies); >>> + ret = wait_for_completion_interruptible(&operation->completion); >> >> Note, that's a case I explicitly handle (failure to queue the work) in >> the async set I sent out. > > Yep, I noticed that, but that was for queueing your timeout work at > submission. And you queued unconditionally, so you could potentially > fail to queue if an operation was submitted twice, even if that would in > itself be a driver bug. Have you run this through loopback without any # of in-flight operations constrained ? _______________________________________________ devel mailing list devel@xxxxxxxxxxxxxxxxxxxxxx http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel