Hi list,
After the nodes ran out of memory and were rebooted, we are no longer able to restart the ceph-osd@x services. (Details about the setup are at the end.)
I am trying to start one manually so we can see the error, but all I get is a series of crash dumps; below is the output from just one of the OSDs that is not starting. Any idea how to get past this?
[root@ceph001 ~]# /usr/bin/ceph-osd --debug_osd 10 -f --cluster ceph --id 83 --setuser ceph --setgroup ceph > /tmp/dump 2>&1
starting osd.83 at - osd_data /var/lib/ceph/osd/ceph-83 /var/lib/ceph/osd/ceph-83/journal
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/13.2.6/rpm/el7/BUILD/ceph-13.2.6/src/osd/ECUtil.h: In function 'ECUtil::stripe_info_t::stripe_info_t(uint64_t, uint64_t)' thread 2aaaaaaf5540 time 2019-10-01 14:19:49.494368
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/13.2.6/rpm/el7/BUILD/ceph-13.2.6/src/osd/ECUtil.h: 34: FAILED assert(stripe_width % stripe_size == 0)
ceph version 13.2.6 (7b695f835b03642f85998b2ae7b6dd093d9fbce4) mimic (stable)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x14b) [0x2aaaaaf3d36b]
2: (()+0x26e4f7) [0x2aaaaaf3d4f7]
3: (ECBackend::ECBackend(PGBackend::Listener*, coll_t const&, boost::intrusive_ptr<ObjectStore::CollectionImpl>&, ObjectStore*, CephContext*, std::shared_ptr<ceph::ErasureCodeInterface>, unsigned long)+0x46d) [0x555555c0bd3d]
4: (PGBackend::build_pg_backend(pg_pool_t const&, std::map<std::string, std::string, std::less<std::string>, std::allocator<std::pair<std::string const, std::string> > > const&, PGBackend::Listener*, coll_t, boost::intrusive_ptr<ObjectStore::CollectionImpl>&,
ObjectStore*, CephContext*)+0x30a) [0x555555b0ba8a]
5: (PrimaryLogPG::PrimaryLogPG(OSDService*, std::shared_ptr<OSDMap const>, PGPool const&, std::map<std::string, std::string, std::less<std::string>, std::allocator<std::pair<std::string const, std::string> > > const&, spg_t)+0x140) [0x555555abd100]
6: (OSD::_make_pg(std::shared_ptr<OSDMap const>, spg_t)+0x10cb) [0x555555914ecb]
7: (OSD::load_pgs()+0x4a9) [0x555555917e39]
8: (OSD::init()+0xc99) [0x5555559238e9]
9: (main()+0x23a3) [0x5555558017a3]
10: (__libc_start_main()+0xf5) [0x2aaab77de495]
11: (()+0x385900) [0x5555558d9900]
[snip: the same assert and the resulting abort backtrace repeat here]
2019-10-01 14:19:49.509 2aaaaaaf5540 -1 *** Caught signal (Aborted) **
in thread 2aaaaaaf5540 thread_name:ceph-osd
ceph version 13.2.6 (7b695f835b03642f85998b2ae7b6dd093d9fbce4) mimic (stable)
1: (()+0xf5d0) [0x2aaab69765d0]
2: (gsignal()+0x37) [0x2aaab77f22c7]
3: (abort()+0x148) [0x2aaab77f39b8]
4: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x248) [0x2aaaaaf3d468]
5: (()+0x26e4f7) [0x2aaaaaf3d4f7]
6: (ECBackend::ECBackend(PGBackend::Listener*, coll_t const&, boost::intrusive_ptr<ObjectStore::CollectionImpl>&, ObjectStore*, CephContext*, std::shared_ptr<ceph::ErasureCodeInterface>, unsigned long)+0x46d) [0x555555c0bd3d]
7: (PGBackend::build_pg_backend(pg_pool_t const&, std::map<std::string, std::string, std::less<std::string>, std::allocator<std::pair<std::string const, std::string> > > const&, PGBackend::Listener*, coll_t, boost::intrusive_ptr<ObjectStore::CollectionImpl>&,
ObjectStore*, CephContext*)+0x30a) [0x555555b0ba8a]
8: (PrimaryLogPG::PrimaryLogPG(OSDService*, std::shared_ptr<OSDMap const>, PGPool const&, std::map<std::string, std::string, std::less<std::string>, std::allocator<std::pair<std::string const, std::string> > > const&, spg_t)+0x140) [0x555555abd100]
9: (OSD::_make_pg(std::shared_ptr<OSDMap const>, spg_t)+0x10cb) [0x555555914ecb]
10: (OSD::load_pgs()+0x4a9) [0x555555917e39]
11: (OSD::init()+0xc99) [0x5555559238e9]
12: (main()+0x23a3) [0x5555558017a3]
13: (__libc_start_main()+0xf5) [0x2aaab77de495]
14: (()+0x385900) [0x5555558d9900]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
[snip: the -693> recent-events entries repeat the same assert and abort backtraces]
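
If I am reading the backtrace right, the assert at ECUtil.h:34 fires while ECBackend is being constructed for an erasure-coded PG: it requires the pool's stripe_width to be an exact multiple of the stripe size passed in, which appears to be the EC profile's data-chunk count (k). As a sketch of how the values can be cross-checked from the monitors (which are still reachable, per the status output at the end; the pool and profile names below are placeholders for our cluster):

    # EC pools print their stripe_width here
    ceph osd pool ls detail
    # which EC profile does the pool use? (replace <ecpool>)
    ceph osd pool get <ecpool> erasure_code_profile
    # show k, m and related settings for that profile (replace <profile>)
    ceph osd erasure-code-profile get <profile>
    # the failed assert corresponds to: stripe_width % k == 0
    # (normally stripe_width = k * stripe_unit)

If those values are consistent on the live cluster, then the bad stripe_width is presumably in whatever the OSD reads locally at startup rather than in the current pool definition.
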
[root@ceph001 ~]# uname -r
3.10.0-957.27.2.el7.x86_64
[root@ceph001 ~]# cat /etc/redhat-release
CentOS Linux release 7.6.1810 (Core)
[root@ceph001 ~]# rpm -qa | grep -i ceph
cm-config-ceph-release-mimic-8.2-73_cm8.2.noarch
ceph-13.2.6-0.el7.x86_64
ceph-selinux-13.2.6-0.el7.x86_64
ceph-base-13.2.6-0.el7.x86_64
ceph-osd-13.2.6-0.el7.x86_64
cm-config-ceph-radosgw-systemd-8.2-6_cm8.2.noarch
libcephfs2-13.2.6-0.el7.x86_64
ceph-common-13.2.6-0.el7.x86_64
ceph-mgr-13.2.6-0.el7.x86_64
cm-config-ceph-systemd-8.2-12_cm8.2.noarch
ceph-mon-13.2.6-0.el7.x86_64
python-cephfs-13.2.6-0.el7.x86_64
ceph-mds-13.2.6-0.el7.x86_64
ceph osd tree:
ID  CLASS WEIGHT    TYPE NAME        STATUS REWEIGHT PRI-AFF
 -1       785.95801 root default
-5 261.98599 host ceph001
1 hdd 7.27699 osd.1 up 1.00000 1.00000
3 hdd 7.27699 osd.3 down 1.00000 1.00000
6 hdd 7.27699 osd.6 down 1.00000 1.00000
9 hdd 7.27699 osd.9 down 0 1.00000
12 hdd 7.27699 osd.12 down 1.00000 1.00000
15 hdd 7.27699 osd.15 up 1.00000 1.00000
18 hdd 7.27699 osd.18 down 1.00000 1.00000
21 hdd 7.27699 osd.21 down 1.00000 1.00000
24 hdd 7.27699 osd.24 up 1.00000 1.00000
27 hdd 7.27699 osd.27 down 1.00000 1.00000
30 hdd 7.27699 osd.30 down 1.00000 1.00000
35 hdd 7.27699 osd.35 down 1.00000 1.00000
37 hdd 7.27699 osd.37 down 1.00000 1.00000
40 hdd 7.27699 osd.40 down 1.00000 1.00000
44 hdd 7.27699 osd.44 down 1.00000 1.00000
47 hdd 7.27699 osd.47 up 1.00000 1.00000
50 hdd 7.27699 osd.50 up 1.00000 1.00000
53 hdd 7.27699 osd.53 down 1.00000 1.00000
56 hdd 7.27699 osd.56 down 1.00000 1.00000
59 hdd 7.27699 osd.59 up 1.00000 1.00000
62 hdd 7.27699 osd.62 down 0 1.00000
65 hdd 7.27699 osd.65 down 1.00000 1.00000
68 hdd 7.27699 osd.68 down 1.00000 1.00000
71 hdd 7.27699 osd.71 down 1.00000 1.00000
74 hdd 7.27699 osd.74 down 1.00000 1.00000
77 hdd 7.27699 osd.77 up 1.00000 1.00000
80 hdd 7.27699 osd.80 down 1.00000 1.00000
83 hdd 7.27699 osd.83 up 1.00000 1.00000
86 hdd 7.27699 osd.86 down 1.00000 1.00000
88 hdd 7.27699 osd.88 down 1.00000 1.00000
91 hdd 7.27699 osd.91 down 1.00000 1.00000
94 hdd 7.27699 osd.94 down 1.00000 1.00000
97 hdd 7.27699 osd.97 down 1.00000 1.00000
100 hdd 7.27699 osd.100 down 0 1.00000
103 hdd 7.27699 osd.103 down 1.00000 1.00000
106 hdd 7.27699 osd.106 up 1.00000 1.00000
-3 261.98599 host ceph002
0 hdd 7.27699 osd.0 down 0 1.00000
4 hdd 7.27699 osd.4 up 1.00000 1.00000
7 hdd 7.27699 osd.7 up 1.00000 1.00000
11 hdd 7.27699 osd.11 down 1.00000 1.00000
13 hdd 7.27699 osd.13 up 1.00000 1.00000
16 hdd 7.27699 osd.16 down 1.00000 1.00000
19 hdd 7.27699 osd.19 down 0 1.00000
23 hdd 7.27699 osd.23 up 1.00000 1.00000
26 hdd 7.27699 osd.26 down 0 1.00000
29 hdd 7.27699 osd.29 down 0 1.00000
32 hdd 7.27699 osd.32 down 0 1.00000
33 hdd 7.27699 osd.33 down 0 1.00000
36 hdd 7.27699 osd.36 down 0 1.00000
39 hdd 7.27699 osd.39 down 1.00000 1.00000
43 hdd 7.27699 osd.43 up 1.00000 1.00000
46 hdd 7.27699 osd.46 up 1.00000 1.00000
49 hdd 7.27699 osd.49 down 1.00000 1.00000
52 hdd 7.27699 osd.52 down 1.00000 1.00000
55 hdd 7.27699 osd.55 down 0 1.00000
58 hdd 7.27699 osd.58 up 1.00000 1.00000
61 hdd 7.27699 osd.61 down 1.00000 1.00000
64 hdd 7.27699 osd.64 down 1.00000 1.00000
67 hdd 7.27699 osd.67 up 1.00000 1.00000
70 hdd 7.27699 osd.70 down 1.00000 1.00000
73 hdd 7.27699 osd.73 down 1.00000 1.00000
76 hdd 7.27699 osd.76 up 1.00000 1.00000
78 hdd 7.27699 osd.78 down 1.00000 1.00000
81 hdd 7.27699 osd.81 down 1.00000 1.00000
84 hdd 7.27699 osd.84 down 0 1.00000
87 hdd 7.27699 osd.87 down 1.00000 1.00000
90 hdd 7.27699 osd.90 down 0 1.00000
93 hdd 7.27699 osd.93 down 1.00000 1.00000
96 hdd 7.27699 osd.96 down 0 1.00000
99 hdd 7.27699 osd.99 down 0 1.00000
102 hdd 7.27699 osd.102 down 0 1.00000
105 hdd 7.27699 osd.105 up 1.00000 1.00000
-7 261.98599 host ceph003
2 hdd 7.27699 osd.2 up 1.00000 1.00000
5 hdd 7.27699 osd.5 down 1.00000 1.00000
8 hdd 7.27699 osd.8 up 1.00000 1.00000
10 hdd 7.27699 osd.10 down 0 1.00000
14 hdd 7.27699 osd.14 down 0 1.00000
17 hdd 7.27699 osd.17 up 1.00000 1.00000
20 hdd 7.27699 osd.20 down 0 1.00000
22 hdd 7.27699 osd.22 down 0 1.00000
25 hdd 7.27699 osd.25 up 1.00000 1.00000
28 hdd 7.27699 osd.28 up 1.00000 1.00000
31 hdd 7.27699 osd.31 down 0 1.00000
34 hdd 7.27699 osd.34 down 0 1.00000
38 hdd 7.27699 osd.38 down 0 1.00000
41 hdd 7.27699 osd.41 down 1.00000 1.00000
42 hdd 7.27699 osd.42 down 0 1.00000
45 hdd 7.27699 osd.45 up 1.00000 1.00000
48 hdd 7.27699 osd.48 up 1.00000 1.00000
51 hdd 7.27699 osd.51 down 1.00000 1.00000
54 hdd 7.27699 osd.54 up 1.00000 1.00000
57 hdd 7.27699 osd.57 down 1.00000 1.00000
60 hdd 7.27699 osd.60 down 1.00000 1.00000
63 hdd 7.27699 osd.63 up 1.00000 1.00000
66 hdd 7.27699 osd.66 down 1.00000 1.00000
69 hdd 7.27699 osd.69 up 1.00000 1.00000
72 hdd 7.27699 osd.72 up 1.00000 1.00000
75 hdd 7.27699 osd.75 down 1.00000 1.00000
79 hdd 7.27699 osd.79 up 1.00000 1.00000
82 hdd 7.27699 osd.82 down 1.00000 1.00000
85 hdd 7.27699 osd.85 down 1.00000 1.00000
89 hdd 7.27699 osd.89 down 0 1.00000
92 hdd 7.27699 osd.92 down 1.00000 1.00000
95 hdd 7.27699 osd.95 down 0 1.00000
98 hdd 7.27699 osd.98 down 0 1.00000
101 hdd 7.27699 osd.101 down 1.00000 1.00000
104 hdd 7.27699 osd.104 down 0 1.00000
107 hdd 7.27699 osd.107 up 1.00000 1.00000
Ceph status:
[root@ceph001 ~]# ceph status
cluster:
id: 54052e72-6835-410e-88a9-af4ac17a8113
health: HEALTH_WARN
1 filesystem is degraded
1 MDSs report slow metadata IOs
48 osds down
Reduced data availability: 2053 pgs inactive, 2043 pgs down, 7 pgs peering, 3 pgs incomplete, 126 pgs stale
Degraded data redundancy: 18473/27200783 objects degraded (0.068%), 106 pgs degraded, 103 pgs undersized
too many PGs per OSD (258 > max 250)
services:
mon: 3 daemons, quorum filler001,filler002,bezavrdat-master01
mgr: bezavrdat-master01(active), standbys: filler002, filler001
mds: cephfs-1/1/1 up {0=filler002=up:replay}, 1 up:standby
osd: 108 osds: 32 up, 80 in; 16 remapped pgs
data:
pools: 2 pools, 2176 pgs
objects: 2.73 M objects, 1.7 TiB
usage: 2.3 TiB used, 580 TiB / 582 TiB avail
pgs: 94.347% pgs not active
18473/27200783 objects degraded (0.068%)
1951 down
79 active+undersized+degraded
76 stale+down
23 stale+active+undersized+degraded
14 down+remapped
14 stale+active+clean
6 stale+peering
3 active+clean
3 stale+active+recovery_wait+degraded
2 incomplete
2 stale+down+remapped
1 stale+incomplete
1 stale+remapped+peering
1 active+recovering+undersized+degraded+remapped
Thank you in advance!
Regards,