Re: OSD doesn't start

On 2012. July 4. 09:34:04 Gregory Farnum wrote:
> Hrm, it looks like the OSD data directory got a little busted somehow. How
> did you perform your upgrade? (That is, how did you kill your daemons, in
> what order, and when did you bring them back up.)

Since it would be long and hard to describe in text, I've collected the
relevant log entries, sorted by time, at http://pastebin.com/Ev3M4DQ9 . The
short story is that after seeing that the OSDs wouldn't start, I tried to bring
down the whole cluster and start it up from scratch. That didn't change
anything, so I rebooted the two machines (running all three daemons) to see
whether that would help. It didn't, and I gave up.

My ceph config is available at http://pastebin.com/KKNjmiWM .

Since this is my test cluster, I'm not very concerned about the data on it.
But the other one, with the same config, is dying, I think. ceph-fuse is eating
around 75% CPU on the sole monitor ("cc") node, and the monitor about 15%. On
the other two nodes, the OSD eats around 50%, the MDS 15%, and the monitor
another 10%. No Ceph filesystem activity is going on at the moment. Blktrace
reports about 1 kB/s of disk traffic on the partition hosting the OSD data dir.
The data seems to be accessible at the moment, but I'm afraid that my production
cluster will end up in a similar situation after the upgrade, so I don't dare
touch it.

Do you have any suggestions about what I should check?

Thanks,
-- 
cc

> On Wednesday, July 4, 2012 at 8:31 AM, Székelyi Szabolcs wrote:
> > Hi,
> > 
> > after upgrading to 0.48 "Argonaut", my OSDs won't start up again. This
> > problem might not be related to the upgrade, since the cluster had
> > strange behavior before, too: ceph-fuse was spinning the CPU around 70%,
> > so did the OSDs. This happened to both of my clusters. Thought that
> > upgrading might solve the problem, but it just got worse.
> > 
> > I've copied the log of the OSD run to http://pastebin.com/XYRtfFMU . I've
> > rebooted all the nodes, but they still don't work.
> > 
> > What should I do to resurrect my OSDs?
> > 
> > Thanks,
> > --
> > cc
> > 
> > 
> > --
> > To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> > the body of a message to majordomo@xxxxxxxxxxxxxxx
> > (mailto:majordomo@xxxxxxxxxxxxxxx) More majordomo info at
> > http://vger.kernel.org/majordomo-info.html