Hi Vasu,
Thank you for your answer.
Yes, all the pools have min_size 1:
root@uhu2 /scripts # ceph osd lspools
0 rbd,1 cephfs_data,2 cephfs_metadata,
root@uhu2 /scripts # ceph osd pool get cephfs_data min_size
min_size: 1
root@uhu2 /scripts # ceph osd pool get cephfs_metadata min_size
min_size: 1
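For completeness, since size and min_size only make sense together (min_size 1 should allow I/O with a single surviving replica), here is a quick way to dump both settings for every pool. This is just a sketch, with the pool names taken from the lspools output above:
root@uhu2 /scripts # for p in rbd cephfs_data cephfs_metadata; do
>   echo "== $p =="
>   ceph osd pool get $p size      # total number of replicas
>   ceph osd pool get $p min_size  # minimum replicas needed to accept I/O
> done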
I stopped all the Ceph services gracefully on the first machine. But
just to get this straight: what if the first machine really suffered a
catastrophic failure? My expectation was that the second machine would
just keep running and serving files. That is why we are using a cluster
in the first place... Or is that expectation already wrong?
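One thing I am not sure about (this is an assumption on my side, since it depends on the monitor layout): with only two monitors, one per node, losing one node would also cost the monitor quorum, and then the cluster stops accepting requests no matter what min_size says. That can be checked on the surviving node:
root@uhu2 /scripts # ceph mon stat                            # how many mons exist, who is in quorum
root@uhu2 /scripts # ceph quorum_status --format json-pretty  # detailed quorum view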
When I stop the services on node1, I get this:
# ceph pg stat
2016-09-29 11:51:09.514814 7fcba012f700 0 -- :/1939885874 >>
136.243.82.227:6789/0 pipe(0x7fcb9c05a730 sd=3 :0 s=1 pgs=0 cs=0 l=1
c=0x7fcb9c05c3f0).fault
v41732: 264 pgs: 264 active+clean; 18514 MB data, 144 GB used, 3546 GB /
3690 GB avail; 1494 B/s rd, 0 op/s
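(The fault line above is just the client timing out against the monitor on the stopped node before retrying; the pg stat itself still comes through, and all 264 PGs report active+clean.) To see whether anything is actually blocked, the checks from the monitoring guide Vasu linked would be something like:
root@uhu2 /scripts # ceph health detail           # lists any PGs that are not active+clean
root@uhu2 /scripts # ceph pg dump_stuck unclean   # PGs stuck in a non-clean state
root@uhu2 /scripts # ceph osd tree                # which OSDs are up or down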
So my question still stands: is there a way to avoid such a situation,
preferably automatically? Or at least a way to manually tell the second
node to keep working and forget about those files?
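From the documentation, the manual escape hatch seems to be marking the dead OSDs as lost, although I am not sure it is the right approach here. A sketch only, where the OSD id 0 is a placeholder:
root@uhu2 /scripts # ceph osd set noout      # optional: don't rebalance while node1 is down
root@uhu2 /scripts # ceph osd lost 0 --yes-i-really-mean-it  # 0 = placeholder id; abandons data that only lived on that OSD!
root@uhu2 /scripts # ceph osd unset noout    # once node1 is back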
BR,
Ranjan
On 28.09.2016 at 18:25, Vasu Kulkarni wrote:
Are all the pools using min_size 1? Did you check pg stat and see which ones
are waiting? Some steps to debug further are at
http://docs.ceph.com/docs/jewel/rados/operations/monitoring-osd-pg/
Also, did you shut down the server abruptly while it was busy?