On Wed, Jun 26, 2013 at 12:24 AM, Wido den Hollander <wido@xxxxxxxx> wrote:
> On 06/26/2013 01:18 AM, Gregory Farnum wrote:
>>
>> Some guesses are inline.
>>
>> On Tue, Jun 25, 2013 at 4:06 PM, Wido den Hollander <wido@xxxxxxxx> wrote:
>>>
>>> Hi,
>>>
>>> I'm not sure what happened, but on a Ceph cluster I noticed that the
>>> monitors (running 0.61) started filling up the disks, so they were
>>> restarted with:
>>>
>>>     mon compact on start = true
>>>
>>> After a restart the osdmap was empty; it showed:
>>>
>>>     osdmap e2: 0 osds: 0 up, 0 in
>>>     pgmap v624077: 15296 pgs: 15296 stale+active+clean; 78104 MB data,
>>>     243 GB used, 66789 GB / 67032 GB avail
>>>     mdsmap e1: 0/0/1 up
>>>
>>> This cluster has 36 OSDs over 9 hosts, but suddenly that was all gone.
>>>
>>> I also checked the crushmap: all 36 OSDs were removed, no trace of them.
>>
>> As you guess, this is probably because the disks filled up. It
>> shouldn't be able to happen, but we found an edge case where leveldb
>> falls apart; there's a fix for it in the repository now (asserting
>> that we get back what we just wrote) that Sage can talk more about.
>> Probably both disappeared because the monitor got nothing back when
>> reading in the newest OSDMap, and so it's all empty.
>
> Sounds reasonable and logical.
>
>>> "ceph auth list" still showed their keys, though.
>>>
>>> Restarting the OSDs didn't help, since create-or-move complained that
>>> the OSDs didn't exist and didn't do anything. I ran "ceph osd create"
>>> to get the 36 OSDs created again, but when the OSDs boot they never
>>> start working.
>>>
>>> The only thing they log is:
>>>
>>> 2013-06-26 01:00:08.852410 7f17f3f16700  0 -- 0.0.0.0:6801/4767 >>
>>> 10.23.24.53:6801/1758 pipe(0x1025fc80 sd=116 :40516 s=1 pgs=0 cs=0
>>> l=0).fault with nothing to send, going to standby
>>
>> Are they going up and just sitting idle?
>> This is probably because none
>> of their peers are telling them to be responsible for any placement
>> groups on startup.
>
> No, they never come up. Checking the monitor logs, I only see the
> create-or-move command changing their crush position, but they never
> mark themselves as "up", so all the OSDs stay down.
>
> netstat, however, shows a connection between the OSD and the Mon, but
> nothing special in the logs at lower debugging.

So the process is still running? Can you generate full logs with
debug ms = 5, debug osd = 20, debug monc = 20?

>>> The internet connection I'm behind is a 3G connection, so I can't go
>>> skimming through the logs with debugging at very high levels, but I'm
>>> just wondering what this could be?
>>>
>>> It's obvious that the monitors filling up probably triggered the
>>> problem, but I'm now looking at a way to get the OSDs back up again.
>>>
>>> In the meantime I upgraded all the nodes to 0.61.4, but that didn't
>>> change anything.
>>>
>>> Any ideas on what this might be and how to resolve it?
>>
>> At a guess, you can go in and grab the last good version of the OSD
>> Map and inject that back into the cluster, then restart the OSDs? If
>> that doesn't work then we'll need to figure out the right way to kick
>> them into being responsible for their stuff.
>> (First, make sure that when you turn them on they are actually
>> connecting to the monitors.)
>
> You mean grabbing the old OSDMap from an OSD or the Monitor store? Both
> are using leveldb for their storage now, right? So I'd have to grab the
> OSDMap using some leveldb tooling?
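[Archive note: the debug levels requested above are ordinary ceph.conf options and can be set in the [osd] section before restarting the daemons. A sketch of the fragment, using exactly the option names named in the request:]

```
[osd]
    debug ms = 5
    debug osd = 20
    debug monc = 20
```

[After restarting the OSD, the verbose output lands in the usual per-daemon log under /var/log/ceph/. Remove or lower the settings afterwards, as debug osd = 20 is very chatty.]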
There's a ceph-monstore-tool (or similar) that provides this
functionality, although it's pretty new, so you might need to grab an
autobuilt package somewhere instead of the cuttlefish one (not sure).
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
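[Archive note: for readers landing here later, a rough sketch of dumping an old OSDMap epoch from a monitor's store with ceph-monstore-tool. The invocation below uses the syntax of later releases and is an assumption for the cuttlefish-era build Greg mentions; the store path and version number are illustrative:]

```
# Stop the monitor first -- the tool opens its leveldb store directly.
# (Syntax assumed from later releases; the cuttlefish-era tool may differ.)
ceph-monstore-tool /var/lib/ceph/mon/ceph-a get osdmap -- --version 1 --out /tmp/osdmap.1
```

[The thread does not cover the injection step itself; how to feed the recovered map back to the monitors would need to be worked out separately.]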