Re: cosd multi-second stalls cause "wrongly marked me down"

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Jim,

It's true that either one of these commits should do the trick, but I
would rather keep a41865e323 to make it explicit that we don't want
SIGPIPE. It's kind of a form of documentation in the code ("oh, I see
they're not using SIGPIPE"), and more documentation is a good thing.

The other issue is that we shouldn't alter the signal disposition of
library users. So it's prudent to put MSG_NOSIGNAL on all of our calls
to send() and sendmsg for consistency, in case we ever end up calling
send() from a library user's in a thread that we didn't create. (I
don't think we do currently, but we may in the future I guess.)

cheers,
Colin


On Thu, Mar 3, 2011 at 3:06 PM, Jim Schutt <jaschut@xxxxxxxxxx> wrote:
> Hi Colin,
>
> On Thu, 2011-03-03 at 14:53 -0700, Colin McCabe wrote:
>> Oh, SIGPIPE, my old nemesis. I should have guessed!
>>
>> I think it's time to block SIGPIPE everywhere... It's much better to
>> get EPIPE than to use a signal handler for this.
>
> I saw your commit d1fce13f9855.
>
> It seems like a41865e323 can be reverted, and
> everything should work correctly?
>
> -- Jim
>
>>
>> regards,
>> Colin
>>
>>
>> On Thu, Mar 3, 2011 at 12:55 PM, Yehuda Sadeh Weinraub
>> <yehudasa@xxxxxxxxx> wrote:
>> > On Thu, Mar 3, 2011 at 12:47 PM, Jim Schutt <jaschut@xxxxxxxxxx> wrote:
>> >> Has something maybe changed in signal handling recently?
>> >>
>> >> Maybe SIGPIPE used to be blocked, and sendmsg() would
>> >> return -EPIPE, but now it's not blocked and not handled?
>> >>
>> >> This bit in linux-2.6.git/net/core/stream.c is what made
>> >> me wonder, but maybe it's a red herring:
>> >>
>> >> int sk_stream_error(struct sock *sk, int flags, int err)
>> >> {
>> >>        if (err == -EPIPE)
>> >>                err = sock_error(sk) ? : -EPIPE;
>> >>        if (err == -EPIPE && !(flags & MSG_NOSIGNAL))
>> >>                send_sig(SIGPIPE, current, 0);
>> >>        return err;
>> >> }
>> >
>> > It was actually just changed at
>> > 35c4a9ffeadfe202b247c8e23719518a874f54e6, so if you're on latest
>> > master then it might be it. You can try reverting that commit, or can
>> > try this:
>> >
>> > index da22c7c..6f746d4 100644
>> > --- a/src/msg/SimpleMessenger.cc
>> > +++ b/src/msg/SimpleMessenger.cc
>> > @@ -1991,7 +1991,7 @@ int SimpleMessenger::Pipe::do_sendmsg(int sd,
>> > struct msghdr *msg, int len, bool
>> >       assert(l == len);
>> >     }
>> >
>> > -    int r = ::sendmsg(sd, msg, more ? MSG_MORE : 0);
>> > +    int r = ::sendmsg(sd, msg, MSG_NOSIGNAL | (more ? MSG_MORE : 0));
>> >     if (r == 0)
>> >       dout(10) << "do_sendmsg hmm do_sendmsg got r==0!" << dendl;
>> >     if (r < 0) {
>> >
>> >
>> > Yehuda
>> >>
>> >> -- Jim
>> >>
>> >>
>> >>
>> >> --
>> >> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> >> the body of a message to majordomo@xxxxxxxxxxxxxxx
>> >> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> >>
>> >
>>
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [CEPH Users]     [Ceph Large]     [Information on CEPH]     [Linux BTRFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]
  Powered by Linux