I have 4 OSDs that won't stay in the cluster. I restart them, they join for a bit, then get kicked out because they stop responding to pings from the other OSDs.

I don't know what the issue is. The disks look fine. SMART reports no errors or reallocated sectors. iostat says the disks are nearly idle when the OSD stops responding. dmesg says it's restarting the process, but doesn't say anything else interesting. kern.log doesn't say anything.

I'm out of ideas, and I'm ready to gamble. So I have two ideas that might fix the issue. I can upgrade Emperor to Firefly. Or I can upgrade Ubuntu 12.04 (kernel 3.5.0-49-generic) to 14.04 (kernel 3.13.0-24-generic). If I upgrade to 14.04, I plan to hold Ceph on Emperor for the time being.

My PG states:
    1989 active+clean
      17 active+remapped
      12 down+peering
     507 active+degraded
       1 active+degraded+remapped+wait_backfill
      28 stale+down+peering
       2 active+recovering+degraded+remapped
       1 down+remapped+peering
       3 incomplete

If I upgrade to Firefly, am I going to make things worse?

Any opinions on which gamble is more likely to pay off? I plan to do both upgrades, but I want to do them one at a time unless necessary. I'm wondering which upgrade I should attempt first.

-- 
Craig Lewis
Senior Systems Engineer
Central Desktop
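
For anyone reading the PG state list above, here is a minimal Python sketch that tallies it. It is only an illustration: the `parse_pg_summary` and `summarize` helpers are hypothetical names, the input is assumed to be the "count state" pairs exactly as quoted in this message, and treating any state containing `down` or `incomplete` as not serving I/O is a simplifying assumption rather than anything Ceph reports directly.

```python
# Hypothetical helper for illustration only: tally the PG state summary
# quoted in this message. Not part of Ceph or any Ceph tooling.

PG_SUMMARY = """
1989 active+clean
17 active+remapped
12 down+peering
507 active+degraded
1 active+degraded+remapped+wait_backfill
28 stale+down+peering
2 active+recovering+degraded+remapped
1 down+remapped+peering
3 incomplete
"""


def parse_pg_summary(text):
    """Return {state: count} from lines of the form '<count> <state>'."""
    counts = {}
    for line in text.strip().splitlines():
        count, state = line.split()
        counts[state] = counts.get(state, 0) + int(count)
    return counts


def summarize(counts):
    """Return (total PGs, active+clean PGs, PGs that are down or incomplete)."""
    total = sum(counts.values())
    clean = counts.get("active+clean", 0)
    # Assumption: a PG whose state includes 'down' or 'incomplete' is not active.
    blocked = sum(
        n for state, n in counts.items()
        if {"down", "incomplete"} & set(state.split("+"))
    )
    return total, clean, blocked


counts = parse_pg_summary(PG_SUMMARY)
total, clean, blocked = summarize(counts)
print(f"total={total} clean={clean} down_or_incomplete={blocked}")
```

For the counts in this message, that works out to 2560 PGs in total, 1989 of them active+clean and 44 either down or incomplete.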