osd crash and high server load - ceph-osd crashes with stacktrace

Hi,

We've noticed a problem with our cluster setup:

4 x OSD nodes:
E5-1630 CPU
32 GB RAM
Mellanox MT27520 56Gbps network cards
SATA controller LSI Logic SAS3008
Storage nodes are connected to two SuperMicro chassis: 847E1C-R1K28JBOD
Each node has 2-3 spinning OSDs (6TB drives) and 2 SSDs (240GB Intel DC S3710) for journal and cache
3 monitors running on OSD nodes
Ceph Hammer 0.94.3
Ubuntu 14.04
standard replicated pools with size 2 (min_size 1)
40GB journal per OSD on the SSDs, 40GB flashcache per OSD.
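
For reference, the SSD and capacity figures in the list above can be sanity-checked with a few lines of arithmetic. This is only a rough sketch: the constant and function names are made up for illustration, decimal units are assumed, every node is assumed to carry the same number of OSDs, and filesystem overhead and Ceph full ratios are ignored.

```python
# Figures taken from the hardware list above; decimal units (1 TB = 1000 GB).
NODES = 4
OSDS_PER_NODE = (2, 3)   # "2-3 spinning OSDs" per node: lower and upper bound
OSD_SIZE_TB = 6
SSDS_PER_NODE = 2
SSD_SIZE_GB = 240
JOURNAL_GB = 40          # journal partition per OSD
FLASHCACHE_GB = 40       # flashcache partition per OSD
REPLICA_SIZE = 2         # pool "size"


def ssd_used_gb(osds: int) -> int:
    """SSD space one node dedicates to journals plus flashcache."""
    return osds * (JOURNAL_GB + FLASHCACHE_GB)


def usable_tb(osds_per_node: int) -> float:
    """Raw capacity divided by the replica count (overheads ignored)."""
    return NODES * osds_per_node * OSD_SIZE_TB / REPLICA_SIZE


for osds in OSDS_PER_NODE:
    print(f"{osds} OSDs/node: {ssd_used_gb(osds)} GB of "
          f"{SSDS_PER_NODE * SSD_SIZE_GB} GB SSD in use, "
          f"about {usable_tb(osds):.0f} TB usable")
```

Under these assumptions the journals and flashcache take at most half of each node's SSD space, and the cluster offers somewhere between 24 and 36 TB of usable capacity at size 2.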

Everything seems to work fine, but every few days one of the nodes (a different one each time) comes under very high load, becomes inaccessible, and has to be rebooted.

After the reboot we can start the OSDs, and the cluster returns to HEALTH_OK fairly quickly.

Judging by the log files, this seems to be related to the ceph-osd processes (links to the logs are at the bottom of this message).

The cluster is a test setup, not used in production, and it is idle at the time the ceph-osd processes crash.

Any help would be appreciated.

ceph-osd log: http://pastebin.com/AGGtvHr2
kernel log: http://pastebin.com/jVSa8eme

J

--
Jacek Jarosiewicz
IT Systems Administrator

----------------------------------------------------------------------------------------
SUPERMEDIA Sp. z o.o., registered office in Warsaw
ul. Senatorska 13/15, 00-075 Warszawa
District Court for the Capital City of Warsaw, 12th Commercial Division of the National Court Register,
KRS no. 0000029537; share capital PLN 42,756,000
NIP: 957-05-49-503
Correspondence address: ul. Jubilerska 10, 04-190 Warszawa

----------------------------------------------------------------------------------------
SUPERMEDIA ->   http://www.supermedia.pl
internet access - hosting - colocation - leased lines - telephony
