On Tue, 25 Oct 2011 19:29:52 +0200 Michal Soltys <soltys@xxxxxxxx> wrote:

> On 11-10-18 11:30, NeilBrown wrote:
> > On Tue, 18 Oct 2011 10:16:41 +0100 "Orlowski, Lukasz"
> > <lukasz.orlowski@xxxxxxxxx> wrote:
> >
> >> Hi,
> >>
> >> I was going through mdadm code and got to realize that r/w
> >> operations are invoked without TEMP_FAILURE_RETRY() macro, which
> >> protects from unexpected operation termination, case SIGINT is
> >> thrown. According to my knowledge its POSIX best-practice to call
> >> the r/w operations within that macro, lest some sporadic unexpected
> >> behaviors occur.
> >>
> >> Any particular reason for not using it?
> >
> > I've never heard of TEMP_FAILURE_RETRY.
> >
> > And having looked in to it I would certainly try to avoid using it.
> >
>
> As this grabbed my attention ..
>
> that macro is just a shortcut to something along the:
>
> do {
>     ret = read/write/etc.( ... );
> } while (ret < 0 && errno == EINTR);
>
> which has always been the proper way to handle such situations
> (recollecting Stevens books, glibc reference manual, or any other solid
> source). Why avoid using it ? Costs nothing, and guarantees we won't run
> into some corner case.

It is ugly and often unnecessary. Ugliness without virtue is a real cost.

>
> > If the SA_RESTART flag is set with sigaction() then it should be
> > totally unnecessary.
> >
>
> signals(7) has pretty large list of when it can or cannot happen, and
> when it will always happen regardless of SA_RESTART. And it would be
> quite different list when other unix vendors are considered (which
> doesn't of course apply to mdadm case, it being only linux specific).
> There're also not ignorable stop signals (and under some cases they will
> end with EINTR as well).
>
> And it's not only SIGINT (as the original mail could suggest), any not
> ignored signal can cause it.

Yes, SA_RESTART isn't really a panacea. SIGSTOP cannot be ignored or
blocked and can have the same effect.
However this only affects system calls that can block (in an interruptible
'S' state, not a non-interruptible 'D' state), and then only if they cannot
complete without returning a valid partial result.

There are very few places where mdadm makes such system calls. Some of the
ioctl calls on md devices technically behave like this, but are very
unlikely to block in practice, and if they do then I probably want them to
fail.

The 'select' calls in msg.c probably should check for EINTR and try again,
but in that case there is already an error check and a loop, and I would
just add

    if (rv < 0 && errno == EINTR)
        continue;

rather than add the macro.

So it is certainly worth auditing the code for places where EINTR might be
returned (and being careful in the first place), but blindly applying
TEMP_FAILURE_RETRY() is (in my opinion) wrong.

In a well written program I would expect any place which might return EINTR
to be a place which could also return other errors that suggest a retry is
needed, and the EINTR checking should just be included with the other
checking.

NeilBrown
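
For illustration only, the kind of loop described above for select() might
look roughly like the sketch below. This is not the code in msg.c: the
helper name wait_readable() and its timeout argument are made up for the
example, and it assumes that simply restarting with the full timeout after
a signal is acceptable. The EINTR test sits next to the ordinary error
check in an existing loop, rather than wrapping the call in a macro.

```c
#include <errno.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Hypothetical helper (not taken from mdadm): wait until fd is readable.
 * Returns 1 if readable, 0 on timeout, -1 on any error other than EINTR.
 */
static int wait_readable(int fd, int timeout_sec)
{
	for (;;) {
		fd_set rfds;
		struct timeval tv;
		int rv;

		/* select() may modify both of these, so rebuild them each pass */
		FD_ZERO(&rfds);
		FD_SET(fd, &rfds);
		tv.tv_sec = timeout_sec;
		tv.tv_usec = 0;

		rv = select(fd + 1, &rfds, NULL, NULL, &tv);
		if (rv < 0 && errno == EINTR)
			continue;	/* interrupted by a signal: try again */
		if (rv < 0)
			return -1;	/* real error, errno is already set */
		return rv > 0 ? 1 : 0;
	}
}
```

A caller that needs a hard deadline would have to track the remaining time
itself, since this sketch restarts the full timeout on every interruption.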