Re: osd crash and high server load - ceph-osd crashes with stacktrace

We've upgraded Ceph to 0.94.4 and the kernel to 3.16.0-51-generic,
but the problem persists. Lately we see these crashes on a daily basis. I'm leaning toward the conclusion that this is a software problem: this hardware ran stably before, and all four nodes crash at random with the same messages in the logs. I'm wondering whether this could be flashcache-related; nothing else comes to mind.

Can anyone look at the logs and help?

ceph-osd log: http://pastebin.com/AGGtvHr2
kernel log: http://pastebin.com/jVSa8eme

J

On 10/09/2015 09:15 AM, Jacek Jarosiewicz wrote:
Hi,

We've noticed a problem with our cluster setup:

4 x OSD nodes:
E5-1630 CPU
32 GB RAM
Mellanox MT27520 56Gbps network cards
SATA controller LSI Logic SAS3008
Storage nodes are connected to two SuperMicro chassis: 847E1C-R1K28JBOD
Each node has 2-3 spinning OSDs (6TB drives) and 2 SSDs (240GB
Intel DC S3710 drives) for journal and cache
3 monitors running on OSD nodes
ceph hammer 0.94.3
Ubuntu 14.04
standard replicated pools with size 2 (min_size 1)
40GB journal per osd on SSD drives, 40GB flashcache per osd.
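
For reference, a rough sketch of the SSD space budget implied by the figures above. It uses nominal sizes and assumes the journal and flashcache partitions are the only allocations on the two SSDs; the function name `ssd_budget` is purely illustrative.

```python
# Rough per-node SSD space budget, using only the figures quoted above.
# Assumption: nominal sizes, and nothing else is stored on the SSDs.
JOURNAL_GB = 40       # journal per OSD
FLASHCACHE_GB = 40    # flashcache per OSD
SSD_COUNT = 2         # SSDs per node
SSD_SIZE_GB = 240     # nominal size of each SSD


def ssd_budget(osd_count):
    """Return (allocated_gb, total_ssd_gb) for a node with osd_count OSDs."""
    allocated = osd_count * (JOURNAL_GB + FLASHCACHE_GB)
    total = SSD_COUNT * SSD_SIZE_GB
    return allocated, total


for osds in (2, 3):
    allocated, total = ssd_budget(osds)
    print(f"{osds} OSDs: {allocated} GB of {total} GB SSD allocated")
```

Under these assumptions, even a node with 3 OSDs allocates only half of the nominal SSD space.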

Everything seems to work fine, but every few days or so one of the nodes
(a different node each time) comes under very high load, becomes
inaccessible, and needs to be rebooted.

After a reboot we can start the OSDs and the cluster returns to the
HEALTH_OK state pretty quickly.

After looking into the log files, this seems to be related to the
ceph-osd processes (links to the logs are at the bottom of this message).

The cluster is a test setup, not used in production, and at the time the
ceph-osd processes crash the cluster isn't doing anything.

Any help would be appreciated.

ceph-osd log: http://pastebin.com/AGGtvHr2
kernel log: http://pastebin.com/jVSa8eme

J



--
Jacek Jarosiewicz
IT Systems Administrator

----------------------------------------------------------------------------------------
SUPERMEDIA Sp. z o.o., registered office in Warsaw
ul. Senatorska 13/15, 00-075 Warszawa
District Court for the Capital City of Warsaw, 12th Commercial Division of the National Court Register,
KRS no. 0000029537; share capital PLN 42,756,000
NIP: 957-05-49-503
Mailing address: ul. Jubilerska 10, 04-190 Warszawa

----------------------------------------------------------------------------------------
SUPERMEDIA ->   http://www.supermedia.pl
internet access - hosting - colocation - leased lines - telephony
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



