Cannot start 5/20 OSDs

Hi All,

I set up a new cluster today with 20 OSDs spanning 4 machines (journals are not stored on separate disks), and a single MON running on a separate server (I understand a single MON is not ideal for production environments).

The cluster had the default pools along with the ones created by radosgw.  There was next to no user data on the cluster, with the exception of a few test files uploaded via the swift client.

I ran the following on one node to increase the replica size from 2 to 3:

for x in $(rados lspools); do ceph osd pool set $x size 3; done
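
For what it's worth, I also checked afterwards that the new size had actually taken effect on each pool, with something like the following (just a sanity check, not part of the original change):

for x in $(rados lspools); do echo -n "$x: "; ceph osd pool get $x size; done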

After doing this, I noticed that 5 OSDs were down. Repeatedly restarting them with the following brings them back online momentarily, but then they go down / out again:

start ceph-osd id=X
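
To keep an eye on which OSDs come back and then drop out again, I've just been watching the cluster status and the OSD tree between restarts:

ceph osd tree    # shows per-OSD up/down and in/out state
ceph -s          # overall cluster and OSD up/in counts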

Looking across the affected nodes, I'm seeing errors like this in the respective OSD logs:

osd/ReplicatedPG.cc: 5405: FAILED assert(ssc)

 ceph version 0.67.3 (408cd61584c72c0d97b774b3d8f95c6b1b06341a)
 1: (ReplicatedPG::prep_push_to_replica(ObjectContext*, hobject_t const&, int, int, PushOp*)+0x8ea) [0x5fd50a]
 2: (ReplicatedPG::prep_object_replica_pushes(hobject_t const&, eversion_t, int, std::map<int, std::vector<PushOp, std::allocator<PushOp> >, std::less<int>, std::allocator<std::pair<int const, std::vector<PushOp, std::allocator<PushOp> > > > >*)+0x722) [0x5fe552]
 3: (ReplicatedPG::recover_replicas(int, ThreadPool::TPHandle&)+0x657) [0x5ff487]
 4: (ReplicatedPG::start_recovery_ops(int, PG::RecoveryCtx*, ThreadPool::TPHandle&)+0x736) [0x61d9c6]
 5: (OSD::do_recovery(PG*, ThreadPool::TPHandle&)+0x1b8) [0x6863e8]
 6: (OSD::RecoveryWQ::_process(PG*, ThreadPool::TPHandle&)+0x11) [0x6c5541]
 7: (ThreadPool::worker(ThreadPool::WorkThread*)+0x4e6) [0x8b8df6]
 8: (ThreadPool::WorkThread::entry()+0x10) [0x8bac00]
 9: (()+0x7e9a) [0x7f610c09fe9a]
 10: (clone()+0x6d) [0x7f610a91dccd]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

Have I done something foolish, or am I hitting a legitimate issue here?

On a side note, my cluster is now in the following state:

2013-09-17 20:47:13.651250 mon.0 [INF] pgmap v1536: 248 pgs: 243 active+clean, 2 active+recovery_wait, 3 active+recovering; 5497 bytes data, 866 MB used, 999 GB / 1000 GB avail; 21/255 degraded (8.235%); 7/85 unfound (8.235%)

According to ceph health detail, the unfound objects are in the .users.uid and .rgw radosgw pools; I suppose I could remove those pools and have radosgw recreate them?  If this is not recoverable, is it advisable to just wipe the cluster and start again?
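
In case it's relevant, this is roughly how I was inspecting the unfound objects before deciding what to do (the PG id 3.5 below is just a placeholder for whatever ceph health detail actually reports):

ceph health detail                   # lists the PGs with unfound objects
ceph pg 3.5 list_missing             # show which objects are unfound in that PG
# if the objects really are unrecoverable, I gather they can be dropped with:
# ceph pg 3.5 mark_unfound_lost revert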

Thanks in advance for the help.

Regards,
Matt
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
