Re: mdadm r/w operations without TEMP_FAILURE_RETRY()

On Tue, 25 Oct 2011 19:29:52 +0200 Michal Soltys <soltys@xxxxxxxx> wrote:

> On 11-10-18 11:30, NeilBrown wrote:
> > On Tue, 18 Oct 2011 10:16:41 +0100 "Orlowski, Lukasz"
> > <lukasz.orlowski@xxxxxxxxx>  wrote:
> >
> >> Hi,
> >>
> >> I was going through the mdadm code and realized that r/w
> >> operations are invoked without the TEMP_FAILURE_RETRY() macro, which
> >> protects against unexpected operation termination in case SIGINT is
> >> raised. To my knowledge it's POSIX best practice to call
> >> the r/w operations within that macro, lest some sporadic unexpected
> >> behaviors occur.
> >>
> >> Any particular reason for not using it?
> >
> > I've never heard of TEMP_FAILURE_RETRY.
> >
> > And having looked into it I would certainly try to avoid using it.
> >
> 
> As this grabbed my attention ..
> 
> that macro is just a shortcut for something along the lines of:
> 
> do {
> 	ret = read/write/etc.( ... );
> } while (ret < 0 && errno == EINTR);
> 
> which has always been the proper way to handle such situations
> (recalling Stevens' books, the glibc reference manual, or any other solid
> source). Why avoid using it? It costs nothing, and guarantees we won't run
> into some corner case.

It is ugly and often unnecessary.
Ugliness without virtue is a real cost.


> 
> > If the SA_RESTART flag is set with sigaction() then it should be
> > totally unnecessary.
> >
> 
> signal(7) has a pretty large list of when it can or cannot happen, and
> when it will always happen regardless of SA_RESTART. And it would be
> quite a different list when other unix vendors are considered (which
> of course doesn't apply to mdadm, it being Linux specific).
> There are also stop signals that cannot be ignored (and in some cases
> they will end with EINTR as well).
> 
> And it's not only SIGINT (as the original mail could suggest); any
> signal that is not ignored can cause it.

Yes, SA_RESTART isn't really a panacea.  SIGSTOP cannot be caught, ignored or
blocked, and being stopped and then resumed can have the same effect.

However this only affects system calls that can block (in an interruptible
'S' state, not a non-interruptible 'D' state), and then only if they have no
valid partial result to return instead.

There are very few places where mdadm makes such system calls.

Some of the ioctl calls on md devices technically behave like this, but are
very unlikely to block in practice, and if they do then I probably want them
to fail.

The 'select' calls in msg.c probably should check for EINTR and try again,
but in that case there is already an error check and a loop and I would just
add
	if (rv < 0 && errno == EINTR)
		continue;

rather than add the macro.

So it is certainly worth auditing the code for places where EINTR might be
returned (and being careful in the first place), but blindly applying
TEMP_FAILURE_RETRY() is (in my opinion) wrong.
In a well-written program I would expect any place which might return EINTR
to be a place which could also return other errors that suggest a retry is
needed, and the EINTR checking should just be included with the other
checking.
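
As an illustration of folding the EINTR check into checking that is needed
anyway, here is a sketch of a read loop.  read_fully() is a hypothetical
helper, not something taken from mdadm; it already has to loop for short
reads, so EINTR becomes one more branch.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: read up to len bytes, stopping early only at
 * end-of-file.  Returns the number of bytes read, or -1 on error. */
ssize_t read_fully(int fd, void *buf, size_t len)
{
	char *p = buf;
	size_t done = 0;

	while (done < len) {
		ssize_t n = read(fd, p + done, len - done);

		if (n > 0) {
			done += (size_t)n;	/* short read: keep going */
			continue;
		}
		if (n == 0)
			break;			/* end of file */
		if (errno == EINTR)
			continue;		/* interrupted before any data */
		return -1;			/* real error */
	}
	return (ssize_t)done;
}
```

A caller asking for 8 bytes from a pipe that delivers "ab", then "cd", then
EOF would get 4 back, whether or not a signal arrived in between.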

NeilBrown
