Hello,

On Wed, 17 Aug 2016 16:54:41 -0500 Dan Jakubiec wrote:

> Hi Wido,
>
> Thank you for the response:
>
> > On Aug 17, 2016, at 16:25, Wido den Hollander <wido@xxxxxxxx> wrote:
> >
> >> Op 17 augustus 2016 om 17:44 schreef Dan Jakubiec <dan.jakubiec@xxxxxxxxx>:
> >>
> >> Hello, we have a Ceph cluster with 8 OSDs that recently lost power to
> >> all 8 machines. We've managed to recover the XFS filesystems on 7 of
> >> the machines, but the OSD service is only starting on 1 of them.
> >>
> >> The other 5 machines all have complaints similar to the following:
> >>
> >> 2016-08-17 09:32:15.549588 7fa2f4666800 -1 filestore(/var/lib/ceph/osd/ceph-1) Error initializing leveldb : Corruption: 6 missing files; e.g.: /var/lib/ceph/osd/ceph-1/current/omap/042421.ldb
> >>

That looks bad. And as Wido said, this shouldn't happen.
What are your XFS mount options for that FS? I tend to remember seeing
"nobarrier" in many OSD examples...

> >> How can we repair the leveldb to allow the OSDs to start up?

Hopefully somebody with a leveldb clue will pipe up, but I have grave
doubts.

> >
> > My first question would be: How did this happen?
> >
> > What hardware are you using underneath? Is there a RAID controller
> > which is not flushing properly? This should not happen during a
> > power failure.
> >
> Each OSD drive is connected to an onboard hardware RAID controller and
> configured in RAID 0 mode as individual virtual disks. The RAID
> controller is an LSI 3108.

What are the configuration options? If there is no BBU and the controller
is forcibly set to writeback caching, this would explain it, too.

> I agree -- I am finding it bizarre that 7 of our 8 OSDs (one per
> machine) did not survive the power outage.

My philosophy on this is that if any of the DCs we're in should suffer a
total and abrupt power loss I won't care, as I'll be buried below tons of
concrete (this being Tokyo). In a place where power outages are more
likely, I'd put a local APU in front of things and issue a remote
shutdown from it when it starts to run out of juice. Having a HW/SW combo
that can survive a sudden power loss is nice; having something in place
that softly shuts things down before that is a lot better.

> We did have some problems with the stock Ubuntu xfs_repair (3.1.9) seg
> faulting, which we eventually overcame by building a newer version of
> xfs_repair (4.7.0). But it did finally repair clean.

That also doesn't instill me with confidence, both Ubuntu- and XFS-wise.

> We actually have some different errors on other OSDs. A few of them are
> failing with "Missing map in load_pgs" errors. But generally speaking
> it appears to be missing files of various types causing different kinds
> of failures.
>
> I'm really nervous now about the OSDs' inability to start with any
> inconsistencies, and the lack of repair utilities (that I can find).
> Any advice on how to recover?

What I've seen in the past assumes that you have at least a running
cluster of sorts, just trashed PGs. This is far worse.

> > I don't know the answer to your question, but lost files are not
> > good.
> >
> > You might find them in a lost+found directory if XFS repair worked?
> >
> Sadly this directory is empty.
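For what it's worth, spotting "nobarrier" across your OSD mounts is quick
to script. A minimal sketch (Python, assuming the stock /var/lib/ceph/osd
mount layout; adjust the path if yours differs):

    # Scan /proc/mounts and flag any OSD filesystem mounted with nobarrier.
    with open("/proc/mounts") as mounts:
        for line in mounts:
            device, mountpoint, fstype, options = line.split()[:4]
            if "/var/lib/ceph/osd" in mountpoint and "nobarrier" in options.split(","):
                print("WARNING: %s on %s is mounted nobarrier" % (mountpoint, device))

With nobarrier set, the kernel never issues cache flushes to the device,
so a controller caching in writeback mode without a BBU can lose the most
recently written data (like those leveldb files) on power loss.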
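As for the leveldb itself: leveldb does ship a RepairDB routine that
salvages whatever tables it can still parse. No idea how it copes with
six missing .ldb files, and it can discard anything it can't make sense
of, so only ever run it against a copy of the omap directory. A rough
sketch using the plyvel Python bindings (an assumption on my part, any
leveldb binding exposing repair should do; whether its leveldb version
can read Ceph's omap data is something you'd have to verify):

    # Repair a copy of the corrupt omap leveldb, then see what survived.
    import shutil
    import plyvel  # pip install plyvel

    OMAP = "/var/lib/ceph/osd/ceph-1/current/omap"
    WORKDIR = OMAP + ".repair"

    shutil.copytree(OMAP, WORKDIR)   # never touch the original
    plyvel.repair_db(WORKDIR)        # salvages readable tables, drops the rest

    db = plyvel.DB(WORKDIR)
    keys = sum(1 for _ in db.iterator(include_value=False))
    print("repaired DB opens, %d keys recovered" % keys)
    db.close()

If that opens cleanly you could try swapping the repaired directory in
(with the OSD stopped, and the original saved away), but expect whatever
omap entries lived in those missing files to surface as inconsistencies
later.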
Christian

>
> -- Dan
>
> > Wido
> >
> >> Thanks,
> >>
> >> -- Dan J

-- 
Christian Balzer        Network/Systems Engineer
chibi@xxxxxxx           Global OnLine Japan/Rakuten Communications
http://www.gol.com/
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com