Hi there,

We're currently having an issue with a Cuttlefish cluster with 3 OSDs and 1 MON. When we tried to restart an OSD, the cluster became unresponsive to 'rbd export'. Here are some sample OSD logs:

OSD we restarted - http://pastebin.com/UUuDdS1V
Another OSD - http://pastebin.com/f12r4W2s

In an attempt to get things back online, we tried restarting the entire cluster. We're now seeing these errors across all three OSDs:

2014-08-11 18:35:24.118737 7f9dbe3ed700 0 -- 10.100.250.1:6806/31838 >> 10.100.250.1:6808/12955 pipe(0x1d07a00 sd=139 :42246 s=1 pgs=0 cs=0 l=0).connect claims to be 10.100.250.1:6808/1344 not 10.100.250.1:6808/12955 - wrong node!
2014-08-11 18:35:29.925865 7f9dc23fc700 0 -- 10.100.250.1:6806/31838 >> 10.100.250.1:6802/12408 pipe(0x1d07500 sd=140 :60606 s=1 pgs=0 cs=0 l=0).connect claims to be 10.100.250.1:6802/5205 not 10.100.250.1:6802/12408 - wrong node!
2014-08-11 18:35:39.119564 7f9dbe3ed700 0 -- 10.100.250.1:6806/31838 >> 10.100.250.1:6808/12955 pipe(0x1d07a00 sd=139 :42253 s=1 pgs=0 cs=0 l=0).connect claims to be 10.100.250.1:6808/1344 not 10.100.250.1:6808/12955 - wrong node!
2014-08-11 18:35:44.926511 7f9dc23fc700 0 -- 10.100.250.1:6806/31838 >> 10.100.250.1:6802/12408 pipe(0x1d07500 sd=140 :60613 s=1 pgs=0 cs=0 l=0).connect claims to be 10.100.250.1:6802/5205 not 10.100.250.1:6802/12408 - wrong node!
2014-08-11 18:35:54.120391 7f9dbe3ed700 0 -- 10.100.250.1:6806/31838 >> 10.100.250.1:6808/12955 pipe(0x1d07a00 sd=139 :42259 s=1 pgs=0 cs=0 l=0).connect claims to be 10.100.250.1:6808/1344 not 10.100.250.1:6808/12955 - wrong node!
2014-08-11 18:35:59.927252 7f9dc23fc700 0 -- 10.100.250.1:6806/31838 >> 10.100.250.1:6802/12408 pipe(0x1d07500 sd=140 :60619 s=1 pgs=0 cs=0 l=0).connect claims to be 10.100.250.1:6802/5205 not 10.100.250.1:6802/12408 - wrong node!

ceph health:

health HEALTH_WARN 6 pgs backfill; 6 pgs backfill_toofull; 3 pgs backfilling; 38 pgs degraded; 859 pgs stale; 859 pgs stuck stale; 47 pgs stuck unclean; recovery 60081/1241780 degraded (4.838%); 1 near full osd(s)
monmap e18: 1 mons at {04=10.100.100.1:6789/0}, election epoch 1, quorum 0 04
osdmap e16752: 4 osds: 2 up, 2 in
pgmap v7355946: 2515 pgs: 1647 active+clean, 6 active+remapped+wait_backfill+backfill_toofull, 821 stale+active+clean, 3 active+remapped+backfilling, 38 stale+active+degraded+remapped; 3421 GB data, 4855 GB used, 630 GB / 5485 GB avail; 60081/1241780 degraded (4.838%)
mdsmap e1: 0/0/1 up
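
For reference, the restarts were issued with the standard init script (assuming a sysvinit-managed Cuttlefish install; the OSD id below is just an example), roughly:

    # restart a single OSD daemon on its host (osd id is illustrative)
    service ceph restart osd.2

    # restart every daemon listed in ceph.conf, cluster-wide
    service ceph -a restart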