Anything in dmesg?

When you say restart, do you mean a physical restart, or just restarting
the daemon? If it takes a physical restart and you're using Intel NICs,
it might be worth upgrading network drivers. Old versions have some bugs
that cause them to just drop traffic.

On 5/14/2014 9:06 PM, Craig Lewis wrote:
> I have 4 OSDs that won't stay in the cluster. I restart them, they
> join for a bit, then get kicked out because they stop responding to
> pings from the other OSDs.
>
> I don't know what the issue is. The disks look fine. SMART reports
> no errors or reallocated sectors. iostat says the disks are nearly
> idle when the OSD stops responding. dmesg says it's restarting the
> process, but doesn't say anything else interesting. kern.log doesn't
> say anything.
>
> I'm out of ideas, and I'm ready to gamble.
>
> So I have two ideas that might fix the issue. I can upgrade Emperor
> to Firefly. Or I can upgrade Ubuntu 12.04 (kernel 3.5.0-49-generic)
> to 14.04 (kernel 3.13.0-24-generic). If I upgrade to 14.04, I plan to
> hold Ceph on Emperor for the time being.
>
> My PG states:
>     1989 active+clean
>       17 active+remapped
>       12 down+peering
>      507 active+degraded
>        1 active+degraded+remapped+wait_backfill
>       28 stale+down+peering
>        2 active+recovering+degraded+remapped
>        1 down+remapped+peering
>        3 incomplete
>
> If I upgrade to Firefly, am I going to make things worse?
>
> Any opinions on which gamble is more likely to pay off?
>
> I plan to do both upgrades, but I want to do them one at a time unless
> necessary. I'm wondering which upgrade I should attempt first.
>
> --
>
> *Craig Lewis*
> Senior Systems Engineer
> Office +1.714.602.1309
> Email clewis at centraldesktop.com
>
> *Central Desktop.
> Work together in ways you never thought possible.*
> Connect with us Website <http://www.centraldesktop.com/> | Twitter
> <http://www.twitter.com/centraldesktop> | Facebook
> <http://www.facebook.com/CentralDesktop> | LinkedIn
> <http://www.linkedin.com/groups?gid=147417> | Blog
> <http://cdblog.centraldesktop.com/>
>
> _______________________________________________
> ceph-users mailing list
> ceph-users at lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
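
Before picking which upgrade to gamble on, it may help to size up the PG states quoted above. The following is only a rough sketch in Python: the counts are copied from the message, while the `summarize` helper and its three buckets (clean, degraded, and "not active", meaning any state string without the `active` flag) are my own illustrative grouping, not something Ceph reports in this form.

```python
# PG state counts copied from the quoted message.
pg_states = {
    "active+clean": 1989,
    "active+remapped": 17,
    "down+peering": 12,
    "active+degraded": 507,
    "active+degraded+remapped+wait_backfill": 1,
    "stale+down+peering": 28,
    "active+recovering+degraded+remapped": 2,
    "down+remapped+peering": 1,
    "incomplete": 3,
}


def summarize(states):
    """Tally PG counts into a few buckets (hypothetical helper, illustration only)."""
    total = sum(states.values())
    clean = states.get("active+clean", 0)
    # A state string is a '+'-joined list of flags; split it so we match whole flags.
    degraded = sum(n for s, n in states.items() if "degraded" in s.split("+"))
    not_active = sum(n for s, n in states.items() if "active" not in s.split("+"))
    return {
        "total": total,
        "clean": clean,
        "degraded": degraded,
        "not_active": not_active,
    }


summary = summarize(pg_states)
for name, count in summary.items():
    share = 100 * count / summary["total"]
    print(f"{name:>10}: {count:5d} ({share:.1f}%)")
```

With the counts above this reports 2560 PGs in total, of which 1989 are clean, 510 carry the degraded flag, and 44 are not in any active state. Those 44 are the ones to watch after either upgrade, since they depend on the OSDs that keep dropping out.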