What distribution and kernel are you running?
I recently found my cluster running the stock CentOS 3.10 kernel when I thought it was running the ELRepo kernel. After forcing it to boot the correct kernel, my flapping OSD issue went away.
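In case it is useful, a quick way to compare the kernel that is actually running with the one grub will boot next time (standard CentOS 7 tooling, just as a sketch):

# uname -r
# grubby --default-kernel

If uname -r still reports a 3.10.x kernel even though the ELRepo kernel is installed, grub2-set-default plus a reboot should sort it out.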
On Tue, Apr 10, 2018, 2:18 AM Jan Marquardt <jm@xxxxxxxxxxx> wrote:
Hi,
we are experiencing massive problems with our Ceph setup. After we
started a "ceph pg repair" because of scrub errors, OSDs began to
crash, and we have not been able to stop the crashes so far. We are
running Ceph 12.2.4. Both BlueStore and FileStore OSDs have crashed.
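(For reference, the inconsistency was located and the repair triggered
roughly along the usual Luminous lines; <pool> and <pgid> stand in for
the actual names:

# ceph health detail
# rados list-inconsistent-pg <pool>
# rados list-inconsistent-obj <pgid> --format=json-pretty
# ceph pg repair <pgid>
)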
Our cluster currently looks like this:
# ceph -s
  cluster:
    id:     c59e56df-2043-4c92-9492-25f05f268d9f
    health: HEALTH_ERR
            1 osds down
            73005/17149710 objects misplaced (0.426%)
            5 scrub errors
            Reduced data availability: 2 pgs inactive, 2 pgs down
            Possible data damage: 1 pg inconsistent
            Degraded data redundancy: 611518/17149710 objects degraded
            (3.566%), 86 pgs degraded, 86 pgs undersized

  services:
    mon: 3 daemons, quorum head1,head2,head3
    mgr: head3(active), standbys: head2, head1
    osd: 34 osds: 24 up, 25 in; 18 remapped pgs

  data:
    pools:   1 pools, 768 pgs
    objects: 5582k objects, 19500 GB
    usage:   62030 GB used, 31426 GB / 93456 GB avail
    pgs:     0.260% pgs not active
             611518/17149710 objects degraded (3.566%)
             73005/17149710 objects misplaced (0.426%)
             670 active+clean
             75  active+undersized+degraded
             8   active+undersized+degraded+remapped+backfill_wait
             8   active+clean+remapped
             2   down
             2   active+undersized+degraded+remapped+backfilling
             2   active+clean+scrubbing+deep
             1   active+undersized+degraded+inconsistent

  io:
    client:   10911 B/s rd, 118 kB/s wr, 0 op/s rd, 54 op/s wr
    recovery: 31575 kB/s, 8 objects/s
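(In case it helps with diagnosis, the two down PGs can be examined with
the standard commands; <pgid> is a placeholder:

# ceph health detail
# ceph pg dump_stuck inactive
# ceph pg <pgid> query
)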
# ceph osd tree
ID  CLASS WEIGHT    TYPE NAME       STATUS REWEIGHT PRI-AFF
 -1       124.07297 root default
 -2        29.08960     host ceph1
  0   hdd   3.63620         osd.0       up  1.00000 1.00000
  1   hdd   3.63620         osd.1     down        0 1.00000
  2   hdd   3.63620         osd.2       up  1.00000 1.00000
  3   hdd   3.63620         osd.3       up  1.00000 1.00000
  4   hdd   3.63620         osd.4     down        0 1.00000
  5   hdd   3.63620         osd.5     down        0 1.00000
  6   hdd   3.63620         osd.6       up  1.00000 1.00000
  7   hdd   3.63620         osd.7       up  1.00000 1.00000
 -3         7.27240     host ceph2
 14   hdd   3.63620         osd.14      up  1.00000 1.00000
 15   hdd   3.63620         osd.15      up  1.00000 1.00000
 -4        29.11258     host ceph3
 16   hdd   3.63620         osd.16      up  1.00000 1.00000
 18   hdd   3.63620         osd.18    down        0 1.00000
 19   hdd   3.63620         osd.19    down        0 1.00000
 20   hdd   3.65749         osd.20      up  1.00000 1.00000
 21   hdd   3.63620         osd.21      up  1.00000 1.00000
 22   hdd   3.63620         osd.22      up  1.00000 1.00000
 23   hdd   3.63620         osd.23      up  1.00000 1.00000
 24   hdd   3.63789         osd.24    down        0 1.00000
 -9        29.29919     host ceph4
 17   hdd   3.66240         osd.17      up  1.00000 1.00000
 25   hdd   3.66240         osd.25      up  1.00000 1.00000
 26   hdd   3.66240         osd.26    down        0 1.00000
 27   hdd   3.66240         osd.27      up  1.00000 1.00000
 28   hdd   3.66240         osd.28    down        0 1.00000
 29   hdd   3.66240         osd.29      up  1.00000 1.00000
 30   hdd   3.66240         osd.30      up  1.00000 1.00000
 31   hdd   3.66240         osd.31    down        0 1.00000
-11        29.29919     host ceph5
 32   hdd   3.66240         osd.32      up  1.00000 1.00000
 33   hdd   3.66240         osd.33      up  1.00000 1.00000
 34   hdd   3.66240         osd.34      up  1.00000 1.00000
 35   hdd   3.66240         osd.35      up  1.00000 1.00000
 36   hdd   3.66240         osd.36    down  1.00000 1.00000
 37   hdd   3.66240         osd.37      up  1.00000 1.00000
 38   hdd   3.66240         osd.38      up  1.00000 1.00000
 39   hdd   3.66240         osd.39      up  1.00000 1.00000
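(For anyone who wants to look at an individual crashed daemon, the
usual systemd units apply, e.g. for osd.28:

# systemctl status ceph-osd@28
# journalctl -u ceph-osd@28 --since "2 hours ago"
)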
The last OSDs that crashed are #28 and #36. Please find the
corresponding log files here:
http://af.janno.io/ceph/ceph-osd.28.log.1.gz
http://af.janno.io/ceph/ceph-osd.36.log.1.gz
The backtraces look almost the same for all crashed OSDs.
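In case it saves someone downloading the full files: assuming the usual
assert/signal markers appear in the logs, the backtrace can be pulled
out of the gzipped files with something like

# zgrep -E -A 40 'FAILED assert|Caught signal' ceph-osd.28.log.1.gz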
Any help, hint or advice would really be appreciated. Please let me know
if you need any further information.
Best Regards
Jan
--
Artfiles New Media GmbH | Zirkusweg 1 | 20359 Hamburg
Tel: 040 - 32 02 72 90 | Fax: 040 - 32 02 72 95
E-Mail: support@xxxxxxxxxxx | Web: http://www.artfiles.de
Geschäftsführer: Harald Oltmanns | Tim Evers
Eingetragen im Handelsregister Hamburg - HRB 81478
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com