Re: [ceph-users] Help needed porting Ceph to RSockets

Hi Matthew,

I can confirm the behaviour which you describe.
I too believe that the problem is on the client side (ceph command).
My log files show the very same symptom, i.e. the client side
not being able to shut down the pipes properly.

(Q: I had problems yesterday sending a mail to the ceph-users list
with the log files attached, because the size of the attachments
exceeded some limit; I hadn't been subscribed to the list at that
point. Is the use of pastebin.com the better way to provide such
lengthy information in general?)


Best Regards

Andreas Bluemle

On Tue, 13 Aug 2013 11:59:36 +0800
Matthew Anderson <manderson8787@xxxxxxxxx> wrote:

> Moving this conversation to ceph-devel where the devs might be able
> to shed some light on this.
> 
> I've added some additional debug to my code to narrow the issue down
> a bit and the reader thread appears to be getting locked by
> tcp_read_wait() because rpoll never returns an event when the socket
> is shut down. A hacky way of proving this was to lower the timeout in
> rpoll to 5 seconds. When a command like 'ceph osd tree' completes you
> can see it block for 5 seconds until rpoll times out and returns 0.
> The reader thread is then able to join and the pipe can be reaped.
> 
> Ceph log is here - http://pastebin.com/rHK4vYLZ
> Mon log is here - http://pastebin.com/WyAJEw0m
> 
> What's particularly weird is that the monitor receives a POLLHUP
> event when the ceph command shuts down its socket but the ceph
> command never does. When using regular sockets both sides of the
> connection receive a POLLIN | POLLHUP | POLLRDHUP event when the
> sockets are shut down. It would seem like there is a bug in rsockets
> that causes the side that calls shutdown first not to receive the
> correct rpoll events.
> 
> Can anyone comment on whether the above seems right?
> 
> Thanks all
> -Matt
> 
> 
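
For comparison, the regular-socket behaviour described above can be
checked with a small script. This is only a sketch under assumptions:
it uses Python's select.poll() on a Unix-domain socketpair as a
stand-in for the TCP connection, it does not involve rsockets at all,
and the helper names (describe, poll_once, events_after_local_shutdown)
are made up for illustration. POLLRDHUP is Linux-specific, so the
script falls back to the Linux value if the constant is missing.

```python
import select
import socket

# POLLRDHUP is Linux-specific; fall back to the Linux value if Python lacks it.
POLLRDHUP = getattr(select, "POLLRDHUP", 0x2000)

EVENT_NAMES = [
    (select.POLLIN, "POLLIN"),
    (select.POLLHUP, "POLLHUP"),
    (select.POLLERR, "POLLERR"),
    (POLLRDHUP, "POLLRDHUP"),
]


def describe(mask):
    """Render a poll event mask as text, e.g. for a debug log line."""
    return " | ".join(name for bit, name in EVENT_NAMES if mask & bit) or "none"


def poll_once(sock, timeout_ms=1000):
    """Poll a single socket and return its revents mask (0 on timeout)."""
    poller = select.poll()
    poller.register(sock.fileno(), select.POLLIN | POLLRDHUP)
    events = poller.poll(timeout_ms)
    return events[0][1] if events else 0


def events_after_local_shutdown():
    """Shut down one end of a socketpair and poll both ends."""
    a, b = socket.socketpair()
    try:
        a.shutdown(socket.SHUT_RDWR)
        return poll_once(a), poll_once(b)
    finally:
        a.close()
        b.close()


if __name__ == "__main__":
    local, remote = events_after_local_shutdown()
    print("side that called shutdown:", describe(local))
    print("other side:               ", describe(remote))
```

The interesting line is the first one: with regular sockets the side
that calls shutdown gets an event on its own descriptor, which is
exactly what seems to be missing in the rsockets case.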
> On Tue, Aug 13, 2013 at 12:06 AM, Andreas Bluemle <
> andreas.bluemle@xxxxxxxxxxx> wrote:
> 
> > Hi Matthew,
> >
> > I am not quite sure about the POLLRDHUP.
> > On the server side (ceph-mon), tcp_read_wait does see the
> > POLLHUP - which should be the indicator that the
> > other side is shutting down.
> >
> > I have also taken a brief look at the client side (ceph mon stat).
> > It initiates a shutdown - but never finishes. See attached log file
> > from "ceph --log-file ceph-mon-stat.rsockets --debug-ms 30 mon
> > stat". I have also attached the corresponding log file for regular
> > TCP/IP sockets.
> >
> > It looks to me that in the rsockets case, the reaper is able to
> > clean up even though there is still something left to do - and
> > hence the shutdown never completes.
> >
> >
> > Best Regards
> >
> > Andreas Bluemle
> >
> >
> > On Mon, 12 Aug 2013 15:11:47 +0800
> > Matthew Anderson <manderson8787@xxxxxxxxx> wrote:
> >
> > > Hi Andreas,
> > >
> > > I think we're both working on the same thing; I've just changed
> > > the function calls over to rsockets in the source instead of
> > > using the pre-load library. It explains why we're having the
> > > exact same problem!
> > >
> > > From what I've been able to tell the entire problem revolves
> > > around rsockets not supporting POLLRDHUP. As far as I can tell
> > > the pipe will only be removed when tcp_read_wait returns -1. With
> > > rsockets it never receives the POLLRDHUP event after
> > > shutdown_socket() is called so the rpoll call blocks until
> > > timeout (900 seconds) and the pipe stays active.
> > >
> > > The question then would be how can we destroy a pipe without
> > > relying on POLLRDHUP? shutdown_socket() always gets called when
> > > the socket should be closed, so there might be a way to trick
> > > tcp_read_wait() into returning -1 by doing something in
> > > shutdown_socket() but I'm not sure how to go about it.
> > >
> > > Any ideas?
> > >
> >
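
One possible way to make the read wait return -1 without relying on
POLLRDHUP is a wake-up pipe: the reader polls a second descriptor
alongside the socket, and the shutdown path writes a byte to it. The
following is a minimal sketch of that idea, not Ceph code. The Pipe
class, wake_reader() and read_wait() are hypothetical names, it uses
regular sockets via Python's select.poll(), and whether rpoll() accepts
an ordinary pipe descriptor next to an rsocket is an assumption that
would need checking.

```python
import os
import select
import socket


class Pipe:
    """Hypothetical stand-in for a messenger pipe; not Ceph's actual class."""

    def __init__(self, sock):
        self.sock = sock
        # Extra descriptor pair used only to wake the reader thread.
        self.wake_r, self.wake_w = os.pipe()

    def read_wait(self, timeout_ms=900_000):
        """Return 0 if the socket is readable, -1 on wake-up, error or timeout."""
        poller = select.poll()
        poller.register(self.sock.fileno(), select.POLLIN)
        poller.register(self.wake_r, select.POLLIN)
        events = dict(poller.poll(timeout_ms))
        if not events:
            return -1  # timed out
        if self.wake_r in events:
            return -1  # shutdown was requested locally
        bad = select.POLLERR | select.POLLHUP | select.POLLNVAL
        if events.get(self.sock.fileno(), 0) & bad:
            return -1
        return 0

    def wake_reader(self):
        """Make a blocked read_wait() return -1 without any socket event."""
        os.write(self.wake_w, b"x")

    def shutdown_socket(self):
        """Wake the reader first, then shut the socket down as before."""
        self.wake_reader()
        try:
            self.sock.shutdown(socket.SHUT_RDWR)
        except OSError:
            pass  # already disconnected

    def close(self):
        os.close(self.wake_r)
        os.close(self.wake_w)
        self.sock.close()
```

With this shape the reader no longer depends on the socket layer
reporting the local shutdown, so the 900 second timeout should only
matter for connections that are genuinely idle.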



-- 
Andreas Bluemle                     mailto:Andreas.Bluemle@xxxxxxxxxxx
Heinrich Boell Strasse 88           Phone: (+49) 89 4317582
D-81829 Muenchen (Germany)          Mobil: (+49) 177 522 0151


