I'm doing a test of Ceph in two colo facilities. Since it's just
a test, I only have 2 VMs running, one in each colo. Both VMs are
running mon, mds, a single osd, and the RADOS gw. Cephx is
disabled. I'm testing whether the latency between the two facilities
(~20ms) is low enough that I can run a single Ceph cluster in both
locations. If it doesn't work out, I'll run two independent Ceph
clusters with manual replication.
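
In case the exact layout matters, this is roughly what my ceph.conf looks
like. The mon addresses are the ones that show up in the logs below; the
hostnames are placeholders and the rest is written down from memory, so
treat it as a sketch rather than a verbatim copy:

[global]
        # cephx is disabled for this test
        auth cluster required = none
        auth service required = none
        auth client required = none

[mon.a]
        host = colo1-vm          # placeholder hostname
        mon addr = 192.168.2.62:6789

[mon.b]
        host = colo2-vm          # placeholder hostname
        mon addr = 192.168.22.62:6789

[mds.a]
        host = colo1-vm

[mds.b]
        host = colo2-vm

[osd.0]
        host = colo1-vm

[osd.1]
        host = colo2-vm

[client.radosgw.gateway]
        host = colo1-vm
        # rgw-specific options (socket path, log file) omitted here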
This weekend, the connection between the two locations was degraded.
The link had 37% packet loss for less than a minute. When the link
returned to normal, the re-elected mon leader crashed. Is this a real
bug, or did this happen because I'm only running 2 nodes? I'm trying
to avoid bringing more nodes into this test: my VM infrastructure is
pretty weak, and I'm afraid that more nodes would introduce more noise
into the test.

I saw this happen once before (the primary colo had a UPS failure,
causing a switch reboot). The same process crashed, with the same
stack trace. When that happened, I ran sudo service ceph restart on
the machine with the crashed mon, and everything started up fine. I
haven't restarted anything this time.

I tried to recreate the problem by stopping and starting the VPN
between the two locations, but that didn't trigger the crash. I have
some more ideas on how to trigger it; I'll continue trying today (see
the netem sketch at the end of this mail).

arnulf@ceph0:~$ lsb_release -a
Distributor ID: Ubuntu
Description:    Ubuntu 12.04.2 LTS
Release:        12.04
Codename:       precise

arnulf@ceph0:~$ uname -a
Linux ceph0 3.5.0-27-generic #46~precise1-Ubuntu SMP Tue Mar 26 19:33:21 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

arnulf@ceph0:~$ cat /etc/apt/sources.list.d/ceph.list
deb http://ceph.com/debian-testing/ precise main

arnulf@ceph0:~$ ceph -v
ceph version 0.60 (f26f7a39021dbf440c28d6375222e21c94fe8e5c)

ceph-mon.log from the mon that was not elected leader (the peon), mon.b:

2013-04-13 07:57:39.445098 7fde958f4700 0 mon.b@1(peon).data_health(20) update_stats avail 85% total 17295768 used 1679152 avail 14738024
2013-04-13 07:58:35.150603 7fde950f3700 0 log [INF] : mon.b calling new monitor election
2013-04-13 07:58:35.150876 7fde950f3700 1 mon.b@1(electing).elector(20) init, last seen epoch 20
2013-04-13 07:58:39.445355 7fde958f4700 0 mon.b@1(electing).data_health(20) update_stats avail 85% total 17295768 used 1679152 avail 14738024
2013-04-13 07:58:40.192514 7fde958f4700 1 mon.b@1(electing).elector(21) init, last seen epoch 21
2013-04-13 07:58:43.748907 7fde93dee700 0 -- 192.168.22.62:6789/0 >> 192.168.2.62:6789/0 pipe(0x2c56500 sd=25 :6789 s=2 pgs=108 cs=1 l=0).fault, initiating reconnect
2013-04-13 07:58:43.786209 7fde93ff0700 0 -- 192.168.22.62:6789/0 >> 192.168.2.62:6789/0 pipe(0x2c56500 sd=8 :6789 s=1 pgs=108 cs=2 l=0).fault
2013-04-13 07:59:13.050245 7fde958f4700 1 mon.b@1(probing) e1 discarding message auth(proto 0 26 bytes epoch 1) v1 and sending client elsewhere; we are not in quorum
2013-04-13 07:59:13.050277 7fde958f4700 1 mon.b@1(probing) e1 discarding message auth(proto 0 34 bytes epoch 1) v1 and sending client elsewhere; we are not in quorum
2013-04-13 07:59:13.050285 7fde958f4700 1 mon.b@1(probing) e1 discarding message auth(proto 0 26 bytes epoch 1) v1 and sending client elsewhere; we are not in quorum
...
ceph-mon.log from the elected leader, mon.a:

2013-04-13 07:57:41.756844 7f162be82700 0 mon.a@0(leader).data_health(20) update_stats avail 84% total 17295768 used 1797312 avail 14619864
2013-04-13 07:58:35.210875 7f162b681700 0 log [INF] : mon.a calling new monitor election
2013-04-13 07:58:35.211081 7f162b681700 1 mon.a@0(electing).elector(20) init, last seen epoch 20
2013-04-13 07:58:40.270547 7f162be82700 1 mon.a@0(electing).elector(21) init, last seen epoch 21
2013-04-13 07:58:41.757032 7f162be82700 0 mon.a@0(electing).data_health(20) update_stats avail 84% total 17295768 used 1797312 avail 14619864
2013-04-13 07:58:43.441306 7f162b681700 0 log [INF] : mon.a@0 won leader election with quorum 0,1
2013-04-13 07:58:43.560319 7f162b681700 0 log [INF] : pgmap v1684: 632 pgs: 632 active+clean; 9982 bytes data, 2079 MB used, 100266 MB / 102346 MB avail; 0B/s rd, 0B/s wr, 0op/s
2013-04-13 07:58:43.561722 7f162b681700 -1 mon/PaxosService.cc: In function 'void PaxosService::propose_pending()' thread 7f162b681700 time 2013-04-13 07:58:43.560456
mon/PaxosService.cc: 127: FAILED assert(have_pending)

ceph version 0.60 (f26f7a39021dbf440c28d6375222e21c94fe8e5c)
 1: (PaxosService::propose_pending()+0x46d) [0x4dee3d]
 2: (MDSMonitor::tick()+0x1c62) [0x51cdd2]
 3: (MDSMonitor::on_active()+0x1a) [0x512ada]
 4: (PaxosService::_active()+0x31d) [0x4e067d]
 5: (Context::complete(int)+0xa) [0x4b7b4a]
 6: (finish_contexts(CephContext*, std::list<Context*, std::allocator<Context*> >&, int)+0x95) [0x4ba5a5]
 7: (Paxos::handle_last(MMonPaxos*)+0xbef) [0x4da92f]
 8: (Paxos::dispatch(PaxosServiceMessage*)+0x26b) [0x4dad8b]
 9: (Monitor::_ms_dispatch(Message*)+0x149f) [0x4b310f]
 10: (Monitor::ms_dispatch(Message*)+0x32) [0x4c9d12]
 11: (DispatchQueue::entry()+0x341) [0x698da1]
 12: (DispatchQueue::DispatchThread::entry()+0xd) [0x626c5d]
 13: (()+0x7e9a) [0x7f1630c76e9a]
 14: (clone()+0x6d) [0x7f162f911cbd]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

I saved the log files from both events, in case you want the
--- begin dump of recent events --- sections.

This looks somewhat similar to http://tracker.ceph.com/issues/3495.
The first 6 lines of the stack trace are the same functions, but they
diverge at line 7.
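
One of the other ideas I mentioned above for reproducing this: instead
of just bouncing the VPN, approximate the weekend's 37% packet loss
with netem on the interface facing the other colo for about a minute,
then remove it and watch the leader's ceph-mon.log for the same
assert. The interface name below is a placeholder for my setup:

# inject ~37% packet loss toward the other colo (eth1 is a placeholder)
sudo tc qdisc add dev eth1 root netem loss 37%

# keep the link degraded for roughly the duration of the original event
sleep 60

# restore the link to normal
sudo tc qdisc del dev eth1 root netem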