Multi-device BlueStore OSDs: multiple fsck failures

I've been doing some benchmarking of BlueStore in 10.2.2 the last few days and have come across a failure that keeps happening after stressing the cluster fairly heavily.  Some of the OSDs started failing, and attempts to restart them fail to log anything in /var/log/ceph/, so I tried starting them manually and ran into these error messages:

# /usr/bin/ceph-osd --cluster=ceph -i 4 -f --setuser ceph --setgroup ceph
2016-08-02 22:52:01.190226 7f97d75e1800 -1 WARNING: the following dangerous and experimental features are enabled: *
2016-08-02 22:52:01.190340 7f97d75e1800 -1 WARNING: the following dangerous and experimental features are enabled: *
2016-08-02 22:52:01.190497 7f97d75e1800 -1 WARNING: experimental feature 'bluestore' is enabled
Please be aware that this feature is experimental, untested, unsupported, and may result in data corruption, data loss, and/or irreparable damage to your cluster.  Do not use feature with important data.

starting osd.4 at :/0 osd_data /var/lib/ceph/osd/ceph-4/ /var/lib/ceph/osd/ceph-4/journal
2016-08-02 22:52:01.194461 7f97d75e1800 -1 WARNING: the following dangerous and experimental features are enabled: *
2016-08-02 22:52:01.237619 7f97d75e1800 -1 WARNING: experimental feature 'rocksdb' is enabled
Please be aware that this feature is experimental, untested, unsupported, and may result in data corruption, data loss, and/or irreparable damage to your cluster.  Do not use feature with important data.

2016-08-02 22:52:01.501405 7f97d75e1800 -1 bluestore(/var/lib/ceph/osd/ceph-4/)  a#20:bac03f87:::4_454:head# nid 67134 already in use
2016-08-02 22:52:01.629900 7f97d75e1800 -1 bluestore(/var/lib/ceph/osd/ceph-4/)  9#20:e64f44a7:::4_258:head# nid 78351 already in use
2016-08-02 22:52:01.967599 7f97d75e1800 -1 bluestore(/var/lib/ceph/osd/ceph-4/) fsck free extent 256983760896~1245184 intersects allocated blocks
2016-08-02 22:52:01.967605 7f97d75e1800 -1 bluestore(/var/lib/ceph/osd/ceph-4/) fsck overlap: [256984940544~65536]
2016-08-02 22:52:01.978635 7f97d75e1800 -1 bluestore(/var/lib/ceph/osd/ceph-4/) fsck free extent 258455044096~196608 intersects allocated blocks
2016-08-02 22:52:01.978640 7f97d75e1800 -1 bluestore(/var/lib/ceph/osd/ceph-4/) fsck overlap: [258455175168~65536]
2016-08-02 22:52:01.978647 7f97d75e1800 -1 bluestore(/var/lib/ceph/osd/ceph-4/) fsck leaked some space; free+used = [0~252138684416,252138815488~4844945408,256984940544~1470103552,258455175168~5732719067136] != expected 0~5991174242304
2016-08-02 22:52:02.987479 7f97d75e1800 -1 bluestore(/var/lib/ceph/osd/ceph-4/) mount fsck found 5 errors
2016-08-02 22:52:02.987488 7f97d75e1800 -1 osd.4 0 OSD:init: unable to mount object store
2016-08-02 22:52:02.987498 7f97d75e1800 -1  ** ERROR: osd init failed: (5) Input/output error


Here's another one:

# /usr/bin/ceph-osd --cluster=ceph -i 11 -f --setuser ceph --setgroup ceph
2016-08-03 22:16:49.052319 7f0e4d949800 -1 WARNING: the following dangerous and experimental features are enabled: *
2016-08-03 22:16:49.052445 7f0e4d949800 -1 WARNING: the following dangerous and experimental features are enabled: *
2016-08-03 22:16:49.052690 7f0e4d949800 -1 WARNING: experimental feature 'bluestore' is enabled
Please be aware that this feature is experimental, untested, unsupported, and may result in data corruption, data loss, and/or irreparable damage to your cluster.  Do not use feature with important data.

starting osd.11 at :/0 osd_data /var/lib/ceph/osd/ceph-11/ /var/lib/ceph/osd/ceph-11/journal
2016-08-03 22:16:49.056779 7f0e4d949800 -1 WARNING: the following dangerous and experimental features are enabled: *
2016-08-03 22:16:49.095695 7f0e4d949800 -1 WARNING: experimental feature 'rocksdb' is enabled
Please be aware that this feature is experimental, untested, unsupported, and may result in data corruption, data loss, and/or irreparable damage to your cluster.  Do not use feature with important data.

2016-08-03 22:16:49.821451 7f0e4d949800 -1 bluestore(/var/lib/ceph/osd/ceph-11/)  6#20:2eed99bf:::4_257:head# nid 72869 already in use
2016-08-03 22:16:49.961943 7f0e4d949800 -1 bluestore(/var/lib/ceph/osd/ceph-11/) fsck free extent 257123155968~65536 intersects allocated blocks
2016-08-03 22:16:49.961950 7f0e4d949800 -1 bluestore(/var/lib/ceph/osd/ceph-11/) fsck overlap: [257123155968~65536]
2016-08-03 22:16:49.962012 7f0e4d949800 -1 bluestore(/var/lib/ceph/osd/ceph-11/) fsck leaked some space; free+used = [0~241963433984,241963499520~5749210742784] != expected 0~5991174242304
2016-08-03 22:16:50.855099 7f0e4d949800 -1 bluestore(/var/lib/ceph/osd/ceph-11/) mount fsck found 3 errors
2016-08-03 22:16:50.855109 7f0e4d949800 -1 osd.11 0 OSD:init: unable to mount object store
2016-08-03 22:16:50.855118 7f0e4d949800 -1  ** ERROR: osd init failed: (5) Input/output error


I currently have a total of 12 OSDs down (out of 46), all of which appear to be experiencing this problem.
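
For reference, this is roughly how I've been tracking them: 'ceph osd stat' for the summary up/in counts, then grepping the tree for the specific IDs:

# ceph osd stat
# ceph osd tree | grep -w down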

Here are more details of the cluster (currently just a single node):

2x Xeon E5-2699 v4 @ 2.20GHz
128GiB memory
2x LSI Logic SAS3008 HBAs
3x Intel DC P3700 NVMe cards
48x 6TB HDDs
OS: Ubuntu 14.04.4 w/ Xenial HWE kernel (4.4.0-29-generic)

I've split it up so that each NVMe card handles the BlueStore WAL and DB partitions of 16 OSDs.
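
In case the layout matters, each OSD's ceph.conf section looks roughly like this (the device paths here are illustrative, not my exact partition labels):

[global]
        enable experimental unrecoverable data corrupting features = bluestore rocksdb
        osd objectstore = bluestore

[osd.4]
        # block device on the HDD; db and wal on partitions of one NVMe card
        bluestore block path = /dev/disk/by-partlabel/osd-4-block
        bluestore block db path = /dev/nvme0n1p5
        bluestore block wal path = /dev/nvme0n1p6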

The testing has been done with 'rados bench' and 'cosbench' using a 10+3 erasure coding config.  Overall the performance is quite good (I'm seeing about 74.6 MB/s per disk), but these failures halt my testing each time I run into them, and then I have to rebuild the cluster to continue testing.
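
For reference, the rados bench part of the workload looks roughly like this (the profile and pool names here are examples, not my exact ones):

# ceph osd erasure-code-profile set ec-10-3 k=10 m=3
# ceph osd pool create bench 2048 2048 erasure ec-10-3
# rados bench -p bench 600 write -t 32 --no-cleanup

The writes are the default 4MB objects.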

Let me know if there's any additional information you guys would like me to gather.
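
If more verbose logging would help, I can restart one of the failed OSDs with BlueStore debugging turned up, along these lines (the debug levels are just my guess at what's useful):

# /usr/bin/ceph-osd --cluster=ceph -i 4 -f --setuser ceph --setgroup ceph \
      --debug-bluestore 30 --debug-bdev 20 --log-file /tmp/osd.4.fsck.log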

Thanks,
Bryan
