Re: One OSD always dying


 



Hi,

 

I have now upgraded to Dumpling (ceph version 0.67.5 (a60ac9194718083a4b6a225fc17cad6096c69bd1)), but the OSD still fails at startup with a stack trace.

 

Here's the trace:

 

http://paste.ubuntu.com/6755307/

 

If you need any more information, I will provide it. Can someone please help?

 

Thanks

 

From: ceph-users-bounces@xxxxxxxxxxxxxx [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of Rottmann, Jonas (centron GmbH)
Sent: Monday, 30 December 2013 09:30
To: 'Andrei Mikhailovsky'
Cc: ceph-users@xxxxxxxx
Subject: Re: [ceph-users] One OSD always dying

 

Hi Andrei,

 

This is the first time I have run into this. How can I fix it? Upgrading a cluster that is not fully healthy does not seem like a great idea.

 

After fixing it, I will perform the upgrade ASAP.

 

Thanks for your help so far.

 

From: Andrei Mikhailovsky [mailto:andrei@xxxxxxxxxx]
Sent: Sunday, 29 December 2013 09:40
To: Rottmann, Jonas (centron GmbH)
Cc: ceph-users@xxxxxxxx
Subject: Re: [ceph-users] One OSD always dying

 

 

Jonas,

I saw this happening on a weekly basis as well when I was running the 0.61 branch; however, after switching to the 0.67 branch it stopped. Perhaps you should try upgrading.

Andrei

 


From: "Jonas Rottmann (centron GmbH)" <J.Rottmann@xxxxxxxxxx>
To: "ceph-users@xxxxxxxx" <ceph-users@xxxxxxxx>
Sent: Saturday, 28 December, 2013 9:48:12 AM
Subject: [ceph-users] One OSD always dying

Hi,

 

One of my OSDs keeps dying. I rebooted every node, one after another, and made sure that they all have the same kernel version and glibc.

 

I’m using ceph version 0.61.9 (7440dcd135750839fa0f00263f80722ff6f51e90).

 

dmesg only shows:

 

[ 5745.366041] init: ceph-osd (ceph/3) main process (2510) killed by ABRT signal

[ 5745.366235] init: ceph-osd (ceph/3) main process ended, respawning

[ 5763.824298] init: ceph-osd (ceph/3) main process (2991) killed by SEGV signal

 

Basically, this shows up in the logs every time:

 

2013-12-28 06:35:08.489431 7fc9eccd5700 -1 osd/ReplicatedPG.cc: In function 'ReplicatedPG::RepGather* ReplicatedPG::trim_object(const hobject_t&)' thread 7fc9eccd5700 time 2013-12-28 06:35:08.487862

osd/ReplicatedPG.cc: 1379: FAILED assert(0)

 

If you need more information, I will send it. Please help! The whole cluster isn’t working properly because of this…


_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

 

