Re: [PATCH] staging: greybus: operation: add generic timeout support

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Jan 23, 2017 at 03:39:43PM +0000, Bryan O'Donoghue wrote:
> On 23/01/17 15:13, Johan Hovold wrote:
> > On Mon, Jan 23, 2017 at 02:32:48PM +0000, Bryan O'Donoghue wrote:
> >> On 23/01/17 12:04, Johan Hovold wrote:
> >>
> >>> +static void gb_operation_timeout(unsigned long arg)
> >>> +{
> >>> +	struct gb_operation *operation = (void *)arg;
> >>> +
> >>> +	if (gb_operation_result_set(operation, -ETIMEDOUT)) {
> >>> +		/*
> >>> +		 * A stuck request message will be cancelled from the
> >>> +		 * workqueue.
> >>> +		 */
> >>> +		queue_work(gb_operation_completion_wq, &operation->work);
> >>> +	}
> >>> +}
> >>> +
> >>
> >> If queue_work fails, you will never wake up the waiter.
> > 
> > How could queue_work fail here? The operation result is always set
> > exactly once before being queued so this is fine.
> 
> Perhaps it relates to a bug elsewhere, though I struggle to see how we
> could feasibly re-use the work associated with an in-flight gb_message.

Yeah, and the check you added is even for a different struct
delayed_work, so I really don't think we have anything to worry about
here.

> Could you trap the result code and BUG_ON()/WARN_ON() for it so that we
> can debug this a little bit further ?

IIUC you've only seen this with some of your unfinished work during
development (and for a different work struct), so I don't think we need
to add a WARN_ON yet.

> >>> -
> >>> -	ret = wait_for_completion_interruptible_timeout(&operation->completion,
> >>> -							timeout_jiffies);
> >>> +	ret = wait_for_completion_interruptible(&operation->completion);
> >>
> >> Note, that's a case I explicitly handle (failure to queue the work) in
> >> the async set I sent out.
> > 
> > Yep, I noticed that, but that was for queueing your timeout work at
> > submission. And you queued unconditionally, so you could potentially
> > fail to queue if an operation was submitted twice, even if that would in
> > itself be a driver bug.
> 
> Have you run this through loopback without any # of in-flight operations
> constrained ?

No, not yet as that would require also your changes to the loopback
driver (the loopback driver still has its custom timeout implementation
after this patch).

I tested this using gbsim and synchronous operations as my Ara build
setup is currently broken.

Perhaps you can rerun the tests you saw the queue_work fail with? Are
you using gbsim or real hardware to test with?

Thanks,
Johan
_______________________________________________
greybus-dev mailing list
greybus-dev@xxxxxxxxxxxxxxxx
https://lists.linaro.org/mailman/listinfo/greybus-dev




[Index of Archives]     [Asterisk App Development]     [PJ SIP]     [Gnu Gatekeeper]     [IETF Sipping]     [Info Cyrus]     [ALSA User]     [Fedora Linux Users]     [Linux SCTP]     [DCCP]     [Gimp]     [Yosemite News]     [Deep Creek Hot Springs]     [Yosemite Campsites]     [ISDN Cause Codes]     [Asterisk Books]

  Powered by Linux