Hello,
I've found my Ceph v0.80.3 cluster in a state where 5 of its 34 OSDs went down during the night, after months of running without change. From the Linux logs I found out that the OSD processes were killed because they had consumed all available memory.
The 5 failed OSDs were spread across different hosts of my 4-node cluster (see below). Two hosts act as an SSD cache tier for some of my pools; the other two hosts hold the default rotational-drive storage.
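For reference, this is roughly how the kills show up on the affected nodes (/var/log/messages is just the default syslog location on these hosts):

# list the OOM-killer events involving ceph-osd in the kernel log
grep -iE 'out of memory|killed process' /var/log/messages | grep ceph-osd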
After checking that Linux was no longer out of memory, I attempted to restart the failed OSDs. Most of the OSD daemons exhausted all memory within seconds and got killed by the kernel again:
Oct 28 22:16:34 q07 kernel: Out of memory: Kill process 24207 (ceph-osd) score 867 or sacrifice child
Oct 28 22:16:34 q07 kernel: Killed process 24207, UID 0, (ceph-osd) total-vm:59974412kB, anon-rss:59076880kB, file-rss:512kB
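In case it matters, this is roughly how I've been restarting them and watching the memory (osd.13 is just one example; the service name is the stock Firefly sysvinit one on these nodes):

# start one of the down OSDs again
service ceph start osd.13
# watch its memory use -- RSS climbs to tens of GB within seconds before the OOM killer fires
watch -n1 'ps -o pid,rss,vsz,args -C ceph-osd'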
In the OSD's log on that host I found lots of "slow request" messages like these preceding the crash:
2014-10-28 22:11:20.885527 7f25f84d1700 0 log [WRN] : slow request 31.117125 seconds old, received at 2014-10-28 22:10:49.768291: osd_sub_op(client.168752.0:2197931 14.2c7 888596c7/rbd_data.293272f8695e4.000000000000006f/head//14 [] v 1551'377417 snapset=0=[]:[] snapc=0=[]) v10 currently no flag points reached
2014-10-28 22:11:21.885668 7f25f84d1700 0 log [WRN] : 67 slow requests, 1 included below; oldest blocked for > 9879.304770 secs
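For completeness, this is how I've been looking at the blocked requests cluster-wide and on the node (osd.25 is just an example; the log path is the Firefly default):

# show which OSDs the blocked requests are sitting on
ceph health detail | grep -i blocked
# show the slow request warnings in one OSD's own log
grep 'slow request' /var/log/ceph/ceph-osd.25.log | tail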
Apparently I can't get the cluster fixed just by restarting the OSDs over and over. Is there any other option?
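The only other idea I have is to throttle recovery/backfill before the next restart attempt, along the lines below, but I don't know whether that would help with the memory growth at all:

# stop CRUSH from marking the flapping OSDs out and reshuffling even more data
ceph osd set noout
# lower the recovery/backfill concurrency on the OSDs that are still running
ceph tell osd.* injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1'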
Thank you.
Lukas Kubin
[root@q04 ~]# ceph -s
cluster ec433b4a-9dc0-4d08-bde4-f1657b1fdb99
health HEALTH_ERR 9 pgs backfill; 1 pgs backfilling; 521 pgs degraded; 425 pgs incomplete; 13 pgs inconsistent; 20 pgs recovering; 50 pgs recovery_wait; 151 pgs stale; 425 pgs stuck inactive; 151 pgs stuck stale; 1164 pgs stuck unclean; 12070270 requests are blocked > 32 sec; recovery 887322/35206223 objects degraded (2.520%); 119/17131232 unfound (0.001%); 13 scrub errors
monmap e2: 3 mons at {q03=10.255.253.33:6789/0,q04=10.255.253.34:6789/0,q05=10.255.253.35:6789/0}, election epoch 90, quorum 0,1,2 q03,q04,q05
osdmap e2194: 34 osds: 31 up, 31 in
pgmap v7429812: 5632 pgs, 7 pools, 1446 GB data, 16729 kobjects
            2915 GB used, 12449 GB / 15365 GB avail
            887322/35206223 objects degraded (2.520%); 119/17131232 unfound (0.001%)
                  38 active+recovery_wait+remapped
                4455 active+clean
                  65 stale+incomplete
                   3 active+recovering+remapped
                 359 incomplete
                  12 active+recovery_wait
                 139 active+remapped
                  86 stale+active+degraded
                  16 active+recovering
                   1 active+remapped+backfilling
                  13 active+clean+inconsistent
                   9 active+remapped+wait_backfill
                 434 active+degraded
                   1 remapped+incomplete
                   1 active+recovering+degraded+remapped
  client io 0 B/s rd, 469 kB/s wr, 48 op/s
[root@q04 ~]# ceph osd tree
# id weight type name up/down reweight
-5 3.24 root ssd
-6 1.62 host q06
16 0.18 osd.16 up 1
17 0.18 osd.17 up 1
18 0.18 osd.18 up 1
19 0.18 osd.19 up 1
20 0.18 osd.20 up 1
21 0.18 osd.21 up 1
22 0.18 osd.22 up 1
23 0.18 osd.23 up 1
24 0.18 osd.24 up 1
-7 1.62 host q07
25 0.18 osd.25 up 1
26 0.18 osd.26 up 1
27 0.18 osd.27 up 1
28 0.18 osd.28 up 1
29 0.18 osd.29 up 1
30 0.18 osd.30 up 1
31 0.18 osd.31 up 1
32 0.18 osd.32 up 1
33 0.18 osd.33 up 1
-1 14.56 root default
-4 14.56 root sata
-2 7.28 host q08
0 0.91 osd.0 up 1
1 0.91 osd.1 up 1
2 0.91 osd.2 up 1
3 0.91 osd.3 up 1
11 0.91 osd.11 up 1
12 0.91 osd.12 up 1
13 0.91 osd.13 down 0
14 0.91 osd.14 up 1
-3 7.28 host q09
4 0.91 osd.4 up 1
5 0.91 osd.5 up 1
6 0.91 osd.6 up 1
7 0.91 osd.7 up 1
8 0.91 osd.8 down 0
9 0.91 osd.9 up 1
10 0.91 osd.10 down 0
15 0.91 osd.15 up 1