Here is one scenario I can offer as it relates to a powercut/hard shutdown.
My data center was struck by lightning very early in my time with Ceph, while I was still testing and evaluating.
I had 8 OSDs on 8 hosts, and each OSD was a single-disk RAID0 VD on my LSI RAID controller. The RAID controller did not have a BBU (mistake 1). On the disks, I was using the on-disk cache (pdcache), as well as write-back cache at the controller level (mistakes 2 and 3).
It was a learning experience: it corrupted leveldb on 6 of the 8 OSDs, because writes sitting in the on-disk cache had only partially reached persistent storage.
So the moral of the story is to make sure pdcache is set to off if you expect power failures:

$ sudo /opt/MegaRAID/storcli/storcli64 /c0 add vd type=raid0 drives=252:0 pdcache=off
BBUs would also make it less likely that writes go missing.
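
The command above sets pdcache when a VD is created. For VDs that already exist, a minimal sketch of changing the cache settings might look like the following. The controller ID /c0 and the `run`/`harden_cache` helper names are assumptions for illustration, and the `set pdcache` / `set wrcache` options are as I recall them from the StorCLI reference, so check them against your StorCLI version before relying on this.

```shell
#!/bin/sh
# Sketch only: turn off the on-disk cache on existing virtual drives.
# STORCLI path is the one used in the message; CTRL=/c0 is an assumption.
STORCLI="${STORCLI:-/opt/MegaRAID/storcli/storcli64}"
CTRL="${CTRL:-/c0}"

# Hypothetical helper: print the command when DRY_RUN=1, otherwise run it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: apply power-failure-safe cache settings to all VDs.
harden_cache() {
    # Disable the physical-disk cache on every VD of the controller.
    run sudo "$STORCLI" "$CTRL/vall" set pdcache=off
    # Without a BBU, use write-through so the controller holds no dirty data.
    run sudo "$STORCLI" "$CTRL/vall" set wrcache=wt
    # Show the resulting VD properties so the change can be reviewed.
    run sudo "$STORCLI" "$CTRL/vall" show all
}
```

Nothing runs until `harden_cache` is called. Setting DRY_RUN=1 first makes it print the three commands instead of executing them, which is a reasonable way to review them before touching a controller.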
Reed
Hi,

It really depends on the type of power failure. A normal poweroff of the cluster is fine. I've been managing a large cluster, and we were forced to do a total poweroff twice a year. It worked fine: we just safely unmounted all clients, then set the noout flag and powered the machines down.

A powercut (hard shutdown) can be a big problem, and I would expect problems here.

Tom

On 04-22 05:04, Santu Roy wrote:
Hi
I am very new to Ceph and have been studying it for a few days for a deployment of a Ceph cluster. I am going to deploy Ceph in a small data center where power failure is a big problem. We have a single power supply, a single UPS, and a standby generator. So what happens if all nodes go down due to a power failure? Will it create any problem restarting the services when power is restored?
Looking for your suggestions.
--
Regards,
Santu Roy
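
The planned poweroff described in Tom's reply (unmount clients, set noout, power down) can be sketched roughly as below. This is an illustration under assumptions, not a tested runbook: the `run`, `pre_shutdown`, and `post_startup` names are hypothetical helpers, and the sketch assumes the `ceph` CLI is available on an admin node with a working keyring.

```shell
#!/bin/sh
# Sketch only: bracket a planned full-cluster poweroff with the noout flag.

# Hypothetical helper: print the command when DRY_RUN=1, otherwise run it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Run after all clients are unmounted and before powering hosts down.
pre_shutdown() {
    # Keep stopped OSDs from being marked "out", which would trigger rebalancing.
    run ceph osd set noout
    # Record the cluster state before the hosts go down.
    run ceph -s
}

# Run once all hosts and OSDs are back up.
post_startup() {
    # Check cluster health first.
    run ceph -s
    # Then let the cluster mark failed OSDs out again.
    run ceph osd unset noout
}
```

With DRY_RUN=1, `pre_shutdown` and `post_startup` only print the commands they would run. This covers the planned case only; it does nothing to protect against a sudden powercut.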
_______________________________________________ ceph-users mailing list ceph-users@xxxxxxxxxxxxxx http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com