Hi Matthew,

I am not quite sure about the POLLRDHUP. On the server side (ceph-mon),
tcp_read_wait does see the POLLHUP, which should be the indicator that
the other side is shutting down.

I have also taken a brief look at the client side (ceph mon stat).
It initiates a shutdown but never finishes. See the attached log file from
"ceph --log-file ceph-mon-stat.rsockets --debug-ms 30 mon stat".
I have also attached the corresponding log file for regular TCP/IP sockets.

It looks to me that in the rsockets case, the reaper is able to clean up
even though there is still something left to do, and hence the shutdown
never completes.

Best Regards

Andreas Bluemle

On Mon, 12 Aug 2013 15:11:47 +0800
Matthew Anderson <manderson8787@xxxxxxxxx> wrote:

> Hi Andreas,
>
> I think we're both working on the same thing, I've just changed the
> function calls over to rsockets in the source instead of using the
> pre-load library. It explains why we're having the exact same problem!
>
> From what I've been able to tell the entire problem revolves around
> rsockets not supporting POLLRDHUP. As far as I can tell the pipe will
> only be removed when tcp_read_wait returns -1. With rsockets it never
> receives the POLLRDHUP event after shutdown_socket() is called so the
> rpoll call blocks until timeout (900 seconds) and the pipe stays
> active.
>
> The question then would be how can we destroy a pipe without relying
> on POLLRDHUP? shutdown_socket() always gets called when the socket
> should be closed, so there might be a way to trick
> tcp_read_wait() into returning -1 by doing something in
> shutdown_socket(), but I'm not sure how to go about it.
>
> Any ideas?
>
>
>
> On Mon, Aug 12, 2013 at 1:55 PM, Andreas Bluemle <andreas.bluemle@xxxxxxxxxxx> wrote:
>
> > Hi Matthew,
> >
> > On Fri, 9 Aug 2013 09:11:07 +0200
> > Matthew Anderson <manderson8787@xxxxxxxxx> wrote:
> >
> > > So I've had a chance to re-visit this since Bécholey Alexandre was
> > > kind enough to let me know how to compile Ceph with the RDMACM
> > > library (thank you again!).
> > >
> > > At this stage it compiles and runs but there appears to be a
> > > problem with calling rshutdown in Pipe, as it seems to just wait
> > > forever for the pipe to close, which causes commands like 'ceph
> > > osd tree' to hang indefinitely after they work successfully.
> > > Debug MS is here - http://pastebin.com/WzMJNKZY
> > >
> >
> > I am currently looking at a very similar problem.
> > My test setup is to start ceph-mon monitors and check their state
> > using "ceph mon stat".
> >
> > The monitors (3 instances) and the "ceph mon stat" command are
> > started with LD_PRELOAD=<path to librspreload.so>.
> >
> > The behaviour is that the "ceph mon stat" command connects, sends
> > the request and receives the answer, which shows a healthy state
> > for the monitors. But the "ceph mon stat" does not terminate.
> >
> > On the monitor end I encounter an EOPNOTSUPP being set at the time
> > the connection shall terminate. This is detected in the
> > Pipe::tcp_read_wait() where the socket is poll'ed for IN and HUP
> > events.
> >
> > What I have found out already is that it is not the poll() / rpoll()
> > which set the error: they do return a HUP event and are happy.
> > As far as I can tell, the fact of the EOPNOTSUPP being set is
> > historical at that point, i.e. it must have been set at some
> > earlier stage.
> >
> > I am using ceph v0.61.7.
> >
> >
> > Best Regards
> >
> > Andreas
> >
> >
> > > I also tried RADOS bench but it appears to be doing something
> > > similar.
> > > Debug MS is here - http://pastebin.com/3aXbjzqS
> > >
> > > It seems like it's very close to working... I must be missing
> > > something small that's causing some grief. You can see the OSD
> > > coming up in the ceph monitor and the PGs all become
> > > active+clean. When shutting down the monitor I get the below,
> > > which shows it waiting for the pipes to close -
> > >
> > > 2013-08-09 15:08:31.339394 7f4643cfd700 20 accepter.accepter closing
> > > 2013-08-09 15:08:31.382075 7f4643cfd700 10 accepter.accepter stopping
> > > 2013-08-09 15:08:31.382115 7f464bd397c0 20 -- 172.16.0.1:6789/0 wait: stopped accepter thread
> > > 2013-08-09 15:08:31.382127 7f464bd397c0 20 -- 172.16.0.1:6789/0 wait: stopping reaper thread
> > > 2013-08-09 15:08:31.382146 7f4645500700 10 -- 172.16.0.1:6789/0 reaper_entry done
> > > 2013-08-09 15:08:31.382182 7f464bd397c0 20 -- 172.16.0.1:6789/0 wait: stopped reaper thread
> > > 2013-08-09 15:08:31.382194 7f464bd397c0 10 -- 172.16.0.1:6789/0 wait: closing pipes
> > > 2013-08-09 15:08:31.382200 7f464bd397c0 10 -- 172.16.0.1:6789/0 reaper
> > > 2013-08-09 15:08:31.382205 7f464bd397c0 10 -- 172.16.0.1:6789/0 reaper done
> > > 2013-08-09 15:08:31.382210 7f464bd397c0 10 -- 172.16.0.1:6789/0 wait: waiting for pipes 0x3014c80,0x3015180,0x3015400 to close
> > >
> > > The git repo has been updated if anyone has a few spare minutes to
> > > take a look - https://github.com/funkBuild/ceph-rsockets
> > >
> > > Thanks again
> > > -Matt
> > >
> > >
> > > On Thu, Jun 20, 2013 at 5:09 PM, Matthew Anderson
> > > <manderson8787@xxxxxxxxx> wrote:
> > >
> > > Hi All,
> > >
> > > I've had a few conversations on IRC about getting RDMA support
> > > into Ceph and thought I would give it a quick attempt to
> > > hopefully spur some interest. What I would like to accomplish is
> > > an RSockets only implementation so I'm able to use Ceph, RBD and
> > > QEMU at full speed over an Infiniband fabric.
> > >
> > > What I've tried to do is port Pipe.cc and Accepter.cc to rsockets
> > > by replacing the regular socket calls with the rsocket equivalent.
> > > Unfortunately it doesn't compile and I get an error of -
> > >
> > > CXXLD ceph-osd
> > > ./.libs/libglobal.a(libcommon_la-Accepter.o): In function `Accepter::stop()':
> > > /home/matt/Desktop/ceph-0.61.3-rsockets/src/msg/Accepter.cc:243: undefined reference to `rshutdown'
> > > /home/matt/Desktop/ceph-0.61.3-rsockets/src/msg/Accepter.cc:251: undefined reference to `rclose'
> > > ./.libs/libglobal.a(libcommon_la-Accepter.o): In function `Accepter::entry()':
> > > /home/matt/Desktop/ceph-0.61.3-rsockets/src/msg/Accepter.cc:213: undefined reference to `raccept'
> > > /home/matt/Desktop/ceph-0.61.3-rsockets/src/msg/Accepter.cc:230: undefined reference to `rclose'
> > > ./.libs/libglobal.a(libcommon_la-Accepter.o): In function `Accepter::bind(entity_addr_t const&, int, int)':
> > > /home/matt/Desktop/ceph-0.61.3-rsockets/src/msg/Accepter.cc:61: undefined reference to `rsocket'
> > > /home/matt/Desktop/ceph-0.61.3-rsockets/src/msg/Accepter.cc:80: undefined reference to `rsetsockopt'
> > > /home/matt/Desktop/ceph-0.61.3-rsockets/src/msg/Accepter.cc:87: undefined reference to `rbind'
> > > /home/matt/Desktop/ceph-0.61.3-rsockets/src/msg/Accepter.cc:118: undefined reference to `rgetsockname'
> > > /home/matt/Desktop/ceph-0.61.3-rsockets/src/msg/Accepter.cc:128: undefined reference to `rlisten'
> > > /home/matt/Desktop/ceph-0.61.3-rsockets/src/msg/Accepter.cc:100: undefined reference to `rbind'
> > > /home/matt/Desktop/ceph-0.61.3-rsockets/src/msg/Accepter.cc:87: undefined reference to `rbind'
> > > ./.libs/libglobal.a(libcommon_la-Pipe.o): In function `Pipe::tcp_write(char const*, int)':
> > > /home/matt/Desktop/ceph-0.61.3-rsockets/src/msg/Pipe.cc:2175: undefined reference to `rsend'
> > > /home/matt/Desktop/ceph-0.61.3-rsockets/src/msg/Pipe.cc:2162: undefined reference to `rshutdown'
> > > ./.libs/libglobal.a(libcommon_la-Pipe.o): In function `Pipe::do_sendmsg(msghdr*, int, bool)':
> > > /home/matt/Desktop/ceph-0.61.3-rsockets/src/msg/Pipe.cc:1867: undefined reference to `rsendmsg'
> > > ./.libs/libglobal.a(libcommon_la-Pipe.o): In function `Pipe::tcp_read_nonblocking(char*, int)':
> > > /home/matt/Desktop/ceph-0.61.3-rsockets/src/msg/Pipe.cc:2129: undefined reference to `rrecv'
> > > ./.libs/libglobal.a(libcommon_la-Pipe.o): In function `Pipe::tcp_read(char*, int)':
> > > /home/matt/Desktop/ceph-0.61.3-rsockets/src/msg/Pipe.cc:2079: undefined reference to `rshutdown'
> > > ./.libs/libglobal.a(libcommon_la-Pipe.o): In function `Pipe::connect()':
> > > /home/matt/Desktop/ceph-0.61.3-rsockets/src/msg/Pipe.cc:768: undefined reference to `rclose'
> > > /home/matt/Desktop/ceph-0.61.3-rsockets/src/msg/Pipe.cc:773: undefined reference to `rsocket'
> > > /home/matt/Desktop/ceph-0.61.3-rsockets/src/msg/Pipe.cc:781: undefined reference to `rconnect'
> > > ./.libs/libglobal.a(libcommon_la-Pipe.o): In function `Pipe::writer()':
> > > /home/matt/Desktop/ceph-0.61.3-rsockets/src/msg/Pipe.cc:1471: undefined reference to `rwrite'
> > > collect2: error: ld returned 1 exit status
> > > make[3]: *** [ceph-mon] Error 1
> > >
> > >
> > > From the looks of it I need to include the 'rdma/rsocket.h'
> > > library somewhere else or add librdmacm but I'm not sure where.
> > >
> > > Full disclaimer, I am terrible at C++. If anyone has a few spare
> > > minutes to have a look into the error messages and can point out
> > > where I've gone wrong it would be greatly appreciated.
> > >
> > > I've put the code up at -
> > > https://github.com/funkBuild/ceph-rsockets
> > >
> > > Thanks again
> > > -Matt
> > >
> >
> >
> > --
> > Andreas Bluemle                     mailto:Andreas.Bluemle@xxxxxxxxxxx
> > Heinrich Boell Strasse 88           Phone: (+49) 89 4317582
> > D-81829 Muenchen (Germany)          Mobil: (+49) 177 522 0151
>
>

--
Andreas Bluemle                     mailto:Andreas.Bluemle@xxxxxxxxxxx
Heinrich Boell Strasse 88           Phone: (+49) 89 4317582
D-81829 Muenchen (Germany)          Mobil: (+49) 177 522 0151
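
As an illustration of the question raised in the thread (how to notice that the peer has shut down without relying on POLLRDHUP), here is a minimal sketch. It is not the Ceph Pipe::tcp_read_wait() code: wait_readable(), WaitResult and demo_peer_shutdown() are hypothetical names, and the sketch uses plain poll()/recv() on an AF_UNIX socket pair. Whether rpoll()/rrecv() behave the same way is an assumption that has not been checked here. The idea is to poll for POLLIN only and then peek one byte; a peek that returns 0 indicates that the peer has closed its write side.

```cpp
// Sketch only: NOT Ceph's Pipe::tcp_read_wait(). Plain POSIX sockets, no rsockets.
#include <cassert>
#include <cerrno>
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical result codes, chosen for this example.
enum WaitResult { WAIT_READABLE = 0, WAIT_CLOSED = -1, WAIT_TIMEOUT = -2 };

// Hypothetical helper: wait for input on fd without requesting POLLRDHUP.
WaitResult wait_readable(int fd, int timeout_ms)
{
  assert(fd >= 0);
  struct pollfd pfd;
  pfd.fd = fd;
  pfd.events = POLLIN;   // POLLERR/POLLHUP/POLLNVAL are reported regardless
  pfd.revents = 0;

  int rc = poll(&pfd, 1, timeout_ms);
  if (rc == 0)
    return WAIT_TIMEOUT;
  if (rc < 0)
    return WAIT_CLOSED;  // any poll() error (EINTR included) is treated as fatal here

  if (pfd.revents & POLLIN) {
    // Peek one byte: > 0 means data is pending, 0 means end-of-stream.
    char probe;
    ssize_t n = recv(fd, &probe, 1, MSG_PEEK | MSG_DONTWAIT);
    if (n > 0)
      return WAIT_READABLE;
    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
      return WAIT_TIMEOUT;  // spurious wakeup, nothing to read yet
    return WAIT_CLOSED;     // EOF or a hard error
  }
  // No POLLIN, so only POLLERR, POLLHUP or POLLNVAL can have woken us up.
  return WAIT_CLOSED;
}

// Demo on a local socket pair: the peer half-closes and we detect it via the peek.
WaitResult demo_peer_shutdown()
{
  int sv[2];
  int ok = socketpair(AF_UNIX, SOCK_STREAM, 0, sv);
  assert(ok == 0);
  (void)ok;
  shutdown(sv[1], SHUT_WR);                  // peer signals "no more data"
  WaitResult r = wait_readable(sv[0], 100);
  close(sv[0]);
  close(sv[1]);
  return r;
}
```

Calling demo_peer_shutdown() from a small main() exercises the half-close case. Pending data is reported before the close, so a caller can drain the socket first and see the close on the next call.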
Attachment: ceph-mon-stat.rsockets
Description: Binary data
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com