Re: [PATCH v2] firmware: fix sending -ERESTARTSYS due to signal on fallback

"Luis R. Rodriguez" <mcgrof@xxxxxxxxxx> · Tue, 6 Jun 2017 19:52:54 +0200

Used wrong alias for fsdevel now, its linux-fsdevel ...

  Luis

On Tue, Jun 06, 2017 at 06:34:01PM +0200, Luis R. Rodriguez wrote:
> Adding fsdevel for review on the correct semantics of handling signals on
> write(), in this case a sysfs write which triggered a sync request firmware
> call and what the firmware API should return in such case of a signal (I gather
> this should be -EINTR and not -ERESTARTSYS). Also whether or not SIGINT should
> be followed or if only allowing SIGKILL is fine (fine by me, but it would
> change old behaviour).
> 
> Hoping between fsdevel and linux-api folks we can hash this out.
> 
> On Tue, Jun 06, 2017 at 11:04:37AM +0200, Martin Fuzzey wrote:
> > On 05/06/17 22:24, Luis R. Rodriguez wrote:
> > > 
> > > 
> > > For these two reasons then it would seem best we do two things actually:
> > > 
> > > 1) return -EINTR instead of -EAGAIN when we detect swait_event_interruptible_timeout()
> > > got interrupted by a signal (it returns -ERESTARTSYS)
> > 
> > 
> > I disagree. That would force userspace to handle the signal rather than
> > having the kernel retry.
> > 
> > From Documentation/DocBook/kernel-hacking.tmpl:
> > 
> >    After you slept you should check if a signal occurred: the
> >    Unix/Linux way of handling signals is to temporarily exit the
> >    system call with the <constant>-ERESTARTSYS</constant> error.  The
> >    system call entry code will switch back to user context, process
> >    the signal handler and then your system call will be restarted
> >    (unless the user disabled that).  So you should be prepared to
> >    process the restart, e.g. if you're in the middle of manipulating
> >    some data structure.
> 
> This applies but you are missing my point that the LWN article [0] I referred
> to also stated "Kernel code which uses interruptible sleeps must always check
> to see whether it woke up as a result of a signal, and, if so, clean up
> whatever it was doing and return -EINTR back to user space." -- I realize there
> may be contradiction with above documentation -- this perhaps can be clarified
> with fsdevel folks *but* regardless of that the same article notes Alan Cox
> explains that "Unix tradition (and thus almost all applications) believe file
> store writes to be non signal interruptible. It would not be safe or practical
> to change that guarantee." So for this reason alone there does seem to be an
> exemption to the above documentation worth noting for file store writes, and
> the patch which you tested below *moves* the sysfs write op for firmware in
> that direction by adding a new killable swait.
> 
> [0] https://lwn.net/Articles/288056/                                                                                                                                          
>                                                                                                                                                                                 
> > > 2) Do as you note below and add wait_event_killable_timeout()
> > 
> > Hum,  I do think that would be better but, (please correct me if I'm wrong)
> > the _killable_ variants only allow SIGKILL  (and not SIGINT).
> 
> That seems correct given a TASK_KILLABLE is also TASK_UNINTERRUPTIBLE.
> 
> > 0cb64249ca "firmware_loader: abort request if wait_for_completion is
> > interrupted"
> > 
> > specifically mentrions ctrl-c (SIGINT) in the commit message so that would
> > no longer work.
> 
> Great point, but it *also* allowed SIGKILL, so I do feel the goal was also to
> allow it to be killable. I'm afraid that patch probably did not get proper
> review from sufficient folks and its worth now asking ourselves what we'd like
> to do.  I'm fine with letting go of SIGINT for firmware sysfs calls for the
> sake of keeping with the long standing unix tradition on write, given we *still
> have SIGKILL*.
> 
> > Myself I think having to use kill -9 to interrupt firmware loading by a
> > usespace helper is OK but others may disagree.
> 
> Its why I added fsdevel as well. This is really a semantics and uapi question.
> Between fsdevel and linux-api folks I would hope we can come to a sensible
> resolution.
> 
> > > I do not see why we could not introduce wait_event_killable_timeout()
> > > and swait_event_killable_timeout() into -stables.
> > > After seeing how simple it is to do so I tend to agree. Greg, Peter,
> > > what are your thoughts ?
> > > 
> > > Martin Fuzzey can you test this patch as an alternative to your issue ?
> > > 
> > > diff --git a/drivers/base/firmware_class.c b/drivers/base/firmware_class.c
> > > index b9f907eedbf7..70fc42e5e0da 100644
> > > --- a/drivers/base/firmware_class.c
> > > +++ b/drivers/base/firmware_class.c
> > > @@ -131,7 +131,7 @@ static int __fw_state_wait_common(struct fw_state *fw_st, long timeout)
> > >   {
> > >   	long ret;
> > > -	ret = swait_event_interruptible_timeout(fw_st->wq,
> > > +	ret = swait_event_killable_timeout(fw_st->wq,
> > >   				__fw_state_is_done(READ_ONCE(fw_st->status)),
> > >   				timeout);
> > >   	if (ret != 0 && fw_st->status == FW_STATUS_ABORTED)
> > > diff --git a/include/linux/swait.h b/include/linux/swait.h
> > > index c1f9c62a8a50..9c5ca2898b2f 100644
> > > --- a/include/linux/swait.h
> > > +++ b/include/linux/swait.h
> > > @@ -169,4 +169,29 @@ do {									\
> > >   	__ret;								\
> > >   })
> > > +#define __swait_event_killable(wq, condition)				\
> > > +	(void)___swait_event(wq, condition, TASK_KILLABLE, 0, schedule())
> > > +
> > > +#define swait_event_killable(wq, condition)				\
> > > +({									\
> > > +	int __ret = 0;							\
> > > +	if (!(condition))						\
> > > +		__ret = __swait_event_killable(wq, condition);		\
> > > +	__ret;								\
> > > +})
> > > +
> > > +#define __swait_event_killable_timeout(wq, condition, timeout)		\
> > > +	___swait_event(wq, ___wait_cond_timeout(condition),		\
> > > +		      TASK_INTERRUPTIBLE, timeout,			\
> > > +		      __ret = schedule_timeout(__ret))
> > > +
> > 
> > Should be TASK_KILLABLE above
> 
> Oops yes sorry.
> 
> > > +#define swait_event_killable_timeout(wq, condition, timeout)		\
> > > +({									\
> > > +	long __ret = timeout;						\
> > > +	if (!___wait_cond_timeout(condition))				\
> > > +		__ret = __swait_event_killable_timeout(wq,		\
> > > +						condition, timeout);	\
> > > +	__ret;								\
> > > +})
> > > +
> > >   #endif /* _LINUX_SWAIT_H */
> > > 
> > >    Luis
> > 
> > After replacing TASK_INTERRUPTIBLE with TASK_KILLABLE above it works for me.
> 
> Great, thanks for testing.
> 
>   Luis
> 

-- 
Luis Rodriguez, SUSE LINUX GmbH
Maxfeldstrasse 5; D-90409 Nuernberg