On Tue, Oct 1, 2019 at 10:43 PM Del Monaco, Andrea <andrea.delmonaco@xxxxxxxx> wrote:
Hi list,
After the nodes ran OOM and were rebooted, we are no longer able to restart the ceph-osd@x services. (Details about the setup at the end.)
I am trying to start one manually so we can see the error, but all I get is a series of crash dumps; this is just one of the OSDs that is not starting. Any idea how to get past this?
[root@ceph001 ~]# /usr/bin/ceph-osd --debug_osd 10 -f --cluster ceph --id 83 --setuser ceph --setgroup ceph > /tmp/dump 2>&1
starting osd.83 at - osd_data /var/lib/ceph/osd/ceph-83 /var/lib/ceph/osd/ceph-83/journal
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/13.2.6/rpm/el7/BUILD/ceph-13.2.6/src/osd/ECUtil.h: In function 'ECUtil::stripe_info_t::stripe_info_t(uint64_t, uint64_t)' thread 2aaaaaaf5540 time 2019-10-01 14:19:49.494368
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/13.2.6/rpm/el7/BUILD/ceph-13.2.6/src/osd/ECUtil.h: 34: FAILED assert(stripe_width % stripe_size == 0)
ceph version 13.2.6 (7b695f835b03642f85998b2ae7b6dd093d9fbce4) mimic (stable)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x14b) [0x2aaaaaf3d36b]
2: (()+0x26e4f7) [0x2aaaaaf3d4f7]
3: (ECBackend::ECBackend(PGBackend::Listener*, coll_t const&, boost::intrusive_ptr<ObjectStore::CollectionImpl>&, ObjectStore*, CephContext*, std::shared_ptr<ceph::ErasureCodeInterface>, unsigned long)+0x46d) [0x555555c0bd3d]
4: (PGBackend::build_pg_backend(pg_pool_t const&, std::map<std::string, std::string, std::less<std::string>, std::allocator<std::pair<std::string const, std::string> > > const&, PGBackend::Listener*, coll_t, boost::intrusive_ptr<ObjectStore::CollectionImpl>&, ObjectStore*, CephContext*)+0x30a) [0x555555b0ba8a]
5: (PrimaryLogPG::PrimaryLogPG(OSDService*, std::shared_ptr<OSDMap const>, PGPool const&, std::map<std::string, std::string, std::less<std::string>, std::allocator<std::pair<std::string const, std::string> > > const&, spg_t)+0x140) [0x555555abd100]
6: (OSD::_make_pg(std::shared_ptr<OSDMap const>, spg_t)+0x10cb) [0x555555914ecb]
7: (OSD::load_pgs()+0x4a9) [0x555555917e39]
8: (OSD::init()+0xc99) [0x5555559238e9]
9: (main()+0x23a3) [0x5555558017a3]
10: (__libc_start_main()+0xf5) [0x2aaab77de495]
11: (()+0x385900) [0x5555558d9900]
2019-10-01 14:19:49.500 2aaaaaaf5540 -1 /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/13.2.6/rpm/el7/BUILD/ceph-13.2.6/src/osd/ECUtil.h: In function 'ECUtil::stripe_info_t::stripe_info_t(uint64_t, uint64_t)' thread 2aaaaaaf5540 time 2019-10-01 14:19:49.494368
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/13.2.6/rpm/el7/BUILD/ceph-13.2.6/src/osd/ECUtil.h: 34: FAILED assert(stripe_width % stripe_size == 0)
https://tracker.ceph.com/issues/41336 may be relevant here.
Can you post details of the pool involved as well as the erasure code profile in use for that pool?
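For context on why those details matter: judging by the source path in the trace, the assert that fires enforces that the pool's stripe_width divides evenly by the erasure-code data chunk count, and a pool whose stored stripe_width breaks that divisibility aborts the OSD during load_pgs(). A sketch of the invariant (the k and stripe_unit values below are illustrative examples, not taken from this cluster):

```shell
# Illustration only: the invariant behind
#   FAILED assert(stripe_width % stripe_size == 0)
# stripe_size here is the EC data chunk count (k); stripe_width is the
# pool property. Every data chunk must get an equal share of the stripe.
k=4                       # hypothetical data chunk count (e.g. a k=4,m=2 profile)
stripe_unit=4096          # default 4 KiB stripe unit
stripe_width=$((k * stripe_unit))

if [ $((stripe_width % k)) -eq 0 ]; then
  echo "stripe_width=$stripe_width divides by k=$k: OK"
else
  echo "stripe_width=$stripe_width would trip the assert"
fi
```

`ceph osd erasure-code-profile get <profile>` and `ceph osd pool ls detail` (the latter prints each pool's stripe_width) should surface the values asked for above.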
ceph version 13.2.6 (7b695f835b03642f85998b2ae7b6dd093d9fbce4) mimic (stable)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x14b) [0x2aaaaaf3d36b]
2: (()+0x26e4f7) [0x2aaaaaf3d4f7]
3: (ECBackend::ECBackend(PGBackend::Listener*, coll_t const&, boost::intrusive_ptr<ObjectStore::CollectionImpl>&, ObjectStore*, CephContext*, std::shared_ptr<ceph::ErasureCodeInterface>, unsigned long)+0x46d) [0x555555c0bd3d]
4: (PGBackend::build_pg_backend(pg_pool_t const&, std::map<std::string, std::string, std::less<std::string>, std::allocator<std::pair<std::string const, std::string> > > const&, PGBackend::Listener*, coll_t, boost::intrusive_ptr<ObjectStore::CollectionImpl>&, ObjectStore*, CephContext*)+0x30a) [0x555555b0ba8a]
5: (PrimaryLogPG::PrimaryLogPG(OSDService*, std::shared_ptr<OSDMap const>, PGPool const&, std::map<std::string, std::string, std::less<std::string>, std::allocator<std::pair<std::string const, std::string> > > const&, spg_t)+0x140) [0x555555abd100]
6: (OSD::_make_pg(std::shared_ptr<OSDMap const>, spg_t)+0x10cb) [0x555555914ecb]
7: (OSD::load_pgs()+0x4a9) [0x555555917e39]
8: (OSD::init()+0xc99) [0x5555559238e9]
9: (main()+0x23a3) [0x5555558017a3]
10: (__libc_start_main()+0xf5) [0x2aaab77de495]
11: (()+0x385900) [0x5555558d9900]
*** Caught signal (Aborted) **
in thread 2aaaaaaf5540 thread_name:ceph-osd
ceph version 13.2.6 (7b695f835b03642f85998b2ae7b6dd093d9fbce4) mimic (stable)
1: (()+0xf5d0) [0x2aaab69765d0]
2: (gsignal()+0x37) [0x2aaab77f22c7]
3: (abort()+0x148) [0x2aaab77f39b8]
4: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x248) [0x2aaaaaf3d468]
5: (()+0x26e4f7) [0x2aaaaaf3d4f7]
6: (ECBackend::ECBackend(PGBackend::Listener*, coll_t const&, boost::intrusive_ptr<ObjectStore::CollectionImpl>&, ObjectStore*, CephContext*, std::shared_ptr<ceph::ErasureCodeInterface>, unsigned long)+0x46d) [0x555555c0bd3d]
7: (PGBackend::build_pg_backend(pg_pool_t const&, std::map<std::string, std::string, std::less<std::string>, std::allocator<std::pair<std::string const, std::string> > > const&, PGBackend::Listener*, coll_t, boost::intrusive_ptr<ObjectStore::CollectionImpl>&, ObjectStore*, CephContext*)+0x30a) [0x555555b0ba8a]
8: (PrimaryLogPG::PrimaryLogPG(OSDService*, std::shared_ptr<OSDMap const>, PGPool const&, std::map<std::string, std::string, std::less<std::string>, std::allocator<std::pair<std::string const, std::string> > > const&, spg_t)+0x140) [0x555555abd100]
9: (OSD::_make_pg(std::shared_ptr<OSDMap const>, spg_t)+0x10cb) [0x555555914ecb]
10: (OSD::load_pgs()+0x4a9) [0x555555917e39]
11: (OSD::init()+0xc99) [0x5555559238e9]
12: (main()+0x23a3) [0x5555558017a3]
13: (__libc_start_main()+0xf5) [0x2aaab77de495]
14: (()+0x385900) [0x5555558d9900]
2019-10-01 14:19:49.509 2aaaaaaf5540 -1 *** Caught signal (Aborted) **
in thread 2aaaaaaf5540 thread_name:ceph-osd
ceph version 13.2.6 (7b695f835b03642f85998b2ae7b6dd093d9fbce4) mimic (stable)
1: (()+0xf5d0) [0x2aaab69765d0]
2: (gsignal()+0x37) [0x2aaab77f22c7]
3: (abort()+0x148) [0x2aaab77f39b8]
4: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x248) [0x2aaaaaf3d468]
5: (()+0x26e4f7) [0x2aaaaaf3d4f7]
6: (ECBackend::ECBackend(PGBackend::Listener*, coll_t const&, boost::intrusive_ptr<ObjectStore::CollectionImpl>&, ObjectStore*, CephContext*, std::shared_ptr<ceph::ErasureCodeInterface>, unsigned long)+0x46d) [0x555555c0bd3d]
7: (PGBackend::build_pg_backend(pg_pool_t const&, std::map<std::string, std::string, std::less<std::string>, std::allocator<std::pair<std::string const, std::string> > > const&, PGBackend::Listener*, coll_t, boost::intrusive_ptr<ObjectStore::CollectionImpl>&, ObjectStore*, CephContext*)+0x30a) [0x555555b0ba8a]
8: (PrimaryLogPG::PrimaryLogPG(OSDService*, std::shared_ptr<OSDMap const>, PGPool const&, std::map<std::string, std::string, std::less<std::string>, std::allocator<std::pair<std::string const, std::string> > > const&, spg_t)+0x140) [0x555555abd100]
9: (OSD::_make_pg(std::shared_ptr<OSDMap const>, spg_t)+0x10cb) [0x555555914ecb]
10: (OSD::load_pgs()+0x4a9) [0x555555917e39]
11: (OSD::init()+0xc99) [0x5555559238e9]
12: (main()+0x23a3) [0x5555558017a3]
13: (__libc_start_main()+0xf5) [0x2aaab77de495]
14: (()+0x385900) [0x5555558d9900]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
-693> 2019-10-01 14:19:49.500 2aaaaaaf5540 -1 /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/13.2.6/rpm/el7/BUILD/ceph-13.2.6/src/osd/ECUtil.h: In function 'ECUtil::stripe_info_t::stripe_info_t(uint64_t, uint64_t)' thread 2aaaaaaf5540 time 2019-10-01 14:19:49.494368
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/13.2.6/rpm/el7/BUILD/ceph-13.2.6/src/osd/ECUtil.h: 34: FAILED assert(stripe_width % stripe_size == 0)
ceph version 13.2.6 (7b695f835b03642f85998b2ae7b6dd093d9fbce4) mimic (stable)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x14b) [0x2aaaaaf3d36b]
2: (()+0x26e4f7) [0x2aaaaaf3d4f7]
3: (ECBackend::ECBackend(PGBackend::Listener*, coll_t const&, boost::intrusive_ptr<ObjectStore::CollectionImpl>&, ObjectStore*, CephContext*, std::shared_ptr<ceph::ErasureCodeInterface>, unsigned long)+0x46d) [0x555555c0bd3d]
4: (PGBackend::build_pg_backend(pg_pool_t const&, std::map<std::string, std::string, std::less<std::string>, std::allocator<std::pair<std::string const, std::string> > > const&, PGBackend::Listener*, coll_t, boost::intrusive_ptr<ObjectStore::CollectionImpl>&, ObjectStore*, CephContext*)+0x30a) [0x555555b0ba8a]
5: (PrimaryLogPG::PrimaryLogPG(OSDService*, std::shared_ptr<OSDMap const>, PGPool const&, std::map<std::string, std::string, std::less<std::string>, std::allocator<std::pair<std::string const, std::string> > > const&, spg_t)+0x140) [0x555555abd100]
6: (OSD::_make_pg(std::shared_ptr<OSDMap const>, spg_t)+0x10cb) [0x555555914ecb]
7: (OSD::load_pgs()+0x4a9) [0x555555917e39]
8: (OSD::init()+0xc99) [0x5555559238e9]
9: (main()+0x23a3) [0x5555558017a3]
10: (__libc_start_main()+0xf5) [0x2aaab77de495]
11: (()+0x385900) [0x5555558d9900]
-693> 2019-10-01 14:19:49.509 2aaaaaaf5540 -1 *** Caught signal (Aborted) **
in thread 2aaaaaaf5540 thread_name:ceph-osd
ceph version 13.2.6 (7b695f835b03642f85998b2ae7b6dd093d9fbce4) mimic (stable)
1: (()+0xf5d0) [0x2aaab69765d0]
2: (gsignal()+0x37) [0x2aaab77f22c7]
3: (abort()+0x148) [0x2aaab77f39b8]
4: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x248) [0x2aaaaaf3d468]
5: (()+0x26e4f7) [0x2aaaaaf3d4f7]
6: (ECBackend::ECBackend(PGBackend::Listener*, coll_t const&, boost::intrusive_ptr<ObjectStore::CollectionImpl>&, ObjectStore*, CephContext*, std::shared_ptr<ceph::ErasureCodeInterface>, unsigned long)+0x46d) [0x555555c0bd3d]
7: (PGBackend::build_pg_backend(pg_pool_t const&, std::map<std::string, std::string, std::less<std::string>, std::allocator<std::pair<std::string const, std::string> > > const&, PGBackend::Listener*, coll_t, boost::intrusive_ptr<ObjectStore::CollectionImpl>&, ObjectStore*, CephContext*)+0x30a) [0x555555b0ba8a]
8: (PrimaryLogPG::PrimaryLogPG(OSDService*, std::shared_ptr<OSDMap const>, PGPool const&, std::map<std::string, std::string, std::less<std::string>, std::allocator<std::pair<std::string const, std::string> > > const&, spg_t)+0x140) [0x555555abd100]
9: (OSD::_make_pg(std::shared_ptr<OSDMap const>, spg_t)+0x10cb) [0x555555914ecb]
10: (OSD::load_pgs()+0x4a9) [0x555555917e39]
11: (OSD::init()+0xc99) [0x5555559238e9]
12: (main()+0x23a3) [0x5555558017a3]
13: (__libc_start_main()+0xf5) [0x2aaab77de495]
14: (()+0x385900) [0x5555558d9900]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
Environment:
[root@ceph001 ~]# uname -r
3.10.0-957.27.2.el7.x86_64
[root@ceph001 ~]# cat /etc/redhat-release
CentOS Linux release 7.6.1810 (Core)
[root@ceph001 ~]# rpm -qa | grep -i ceph
cm-config-ceph-release-mimic-8.2-73_cm8.2.noarch
ceph-13.2.6-0.el7.x86_64
ceph-selinux-13.2.6-0.el7.x86_64
ceph-base-13.2.6-0.el7.x86_64
ceph-osd-13.2.6-0.el7.x86_64
cm-config-ceph-radosgw-systemd-8.2-6_cm8.2.noarch
libcephfs2-13.2.6-0.el7.x86_64
ceph-common-13.2.6-0.el7.x86_64
ceph-mgr-13.2.6-0.el7.x86_64
cm-config-ceph-systemd-8.2-12_cm8.2.noarch
ceph-mon-13.2.6-0.el7.x86_64
python-cephfs-13.2.6-0.el7.x86_64
ceph-mds-13.2.6-0.el7.x86_64
ceph osd tree:
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 785.95801 root default
-5 261.98599 host ceph001
1 hdd 7.27699 osd.1 up 1.00000 1.00000
3 hdd 7.27699 osd.3 down 1.00000 1.00000
6 hdd 7.27699 osd.6 down 1.00000 1.00000
9 hdd 7.27699 osd.9 down 0 1.00000
12 hdd 7.27699 osd.12 down 1.00000 1.00000
15 hdd 7.27699 osd.15 up 1.00000 1.00000
18 hdd 7.27699 osd.18 down 1.00000 1.00000
21 hdd 7.27699 osd.21 down 1.00000 1.00000
24 hdd 7.27699 osd.24 up 1.00000 1.00000
27 hdd 7.27699 osd.27 down 1.00000 1.00000
30 hdd 7.27699 osd.30 down 1.00000 1.00000
35 hdd 7.27699 osd.35 down 1.00000 1.00000
37 hdd 7.27699 osd.37 down 1.00000 1.00000
40 hdd 7.27699 osd.40 down 1.00000 1.00000
44 hdd 7.27699 osd.44 down 1.00000 1.00000
47 hdd 7.27699 osd.47 up 1.00000 1.00000
50 hdd 7.27699 osd.50 up 1.00000 1.00000
53 hdd 7.27699 osd.53 down 1.00000 1.00000
56 hdd 7.27699 osd.56 down 1.00000 1.00000
59 hdd 7.27699 osd.59 up 1.00000 1.00000
62 hdd 7.27699 osd.62 down 0 1.00000
65 hdd 7.27699 osd.65 down 1.00000 1.00000
68 hdd 7.27699 osd.68 down 1.00000 1.00000
71 hdd 7.27699 osd.71 down 1.00000 1.00000
74 hdd 7.27699 osd.74 down 1.00000 1.00000
77 hdd 7.27699 osd.77 up 1.00000 1.00000
80 hdd 7.27699 osd.80 down 1.00000 1.00000
83 hdd 7.27699 osd.83 up 1.00000 1.00000
86 hdd 7.27699 osd.86 down 1.00000 1.00000
88 hdd 7.27699 osd.88 down 1.00000 1.00000
91 hdd 7.27699 osd.91 down 1.00000 1.00000
94 hdd 7.27699 osd.94 down 1.00000 1.00000
97 hdd 7.27699 osd.97 down 1.00000 1.00000
100 hdd 7.27699 osd.100 down 0 1.00000
103 hdd 7.27699 osd.103 down 1.00000 1.00000
106 hdd 7.27699 osd.106 up 1.00000 1.00000
-3 261.98599 host ceph002
0 hdd 7.27699 osd.0 down 0 1.00000
4 hdd 7.27699 osd.4 up 1.00000 1.00000
7 hdd 7.27699 osd.7 up 1.00000 1.00000
11 hdd 7.27699 osd.11 down 1.00000 1.00000
13 hdd 7.27699 osd.13 up 1.00000 1.00000
16 hdd 7.27699 osd.16 down 1.00000 1.00000
19 hdd 7.27699 osd.19 down 0 1.00000
23 hdd 7.27699 osd.23 up 1.00000 1.00000
26 hdd 7.27699 osd.26 down 0 1.00000
29 hdd 7.27699 osd.29 down 0 1.00000
32 hdd 7.27699 osd.32 down 0 1.00000
33 hdd 7.27699 osd.33 down 0 1.00000
36 hdd 7.27699 osd.36 down 0 1.00000
39 hdd 7.27699 osd.39 down 1.00000 1.00000
43 hdd 7.27699 osd.43 up 1.00000 1.00000
46 hdd 7.27699 osd.46 up 1.00000 1.00000
49 hdd 7.27699 osd.49 down 1.00000 1.00000
52 hdd 7.27699 osd.52 down 1.00000 1.00000
55 hdd 7.27699 osd.55 down 0 1.00000
58 hdd 7.27699 osd.58 up 1.00000 1.00000
61 hdd 7.27699 osd.61 down 1.00000 1.00000
64 hdd 7.27699 osd.64 down 1.00000 1.00000
67 hdd 7.27699 osd.67 up 1.00000 1.00000
70 hdd 7.27699 osd.70 down 1.00000 1.00000
73 hdd 7.27699 osd.73 down 1.00000 1.00000
76 hdd 7.27699 osd.76 up 1.00000 1.00000
78 hdd 7.27699 osd.78 down 1.00000 1.00000
81 hdd 7.27699 osd.81 down 1.00000 1.00000
84 hdd 7.27699 osd.84 down 0 1.00000
87 hdd 7.27699 osd.87 down 1.00000 1.00000
90 hdd 7.27699 osd.90 down 0 1.00000
93 hdd 7.27699 osd.93 down 1.00000 1.00000
96 hdd 7.27699 osd.96 down 0 1.00000
99 hdd 7.27699 osd.99 down 0 1.00000
102 hdd 7.27699 osd.102 down 0 1.00000
105 hdd 7.27699 osd.105 up 1.00000 1.00000
-7 261.98599 host ceph003
2 hdd 7.27699 osd.2 up 1.00000 1.00000
5 hdd 7.27699 osd.5 down 1.00000 1.00000
8 hdd 7.27699 osd.8 up 1.00000 1.00000
10 hdd 7.27699 osd.10 down 0 1.00000
14 hdd 7.27699 osd.14 down 0 1.00000
17 hdd 7.27699 osd.17 up 1.00000 1.00000
20 hdd 7.27699 osd.20 down 0 1.00000
22 hdd 7.27699 osd.22 down 0 1.00000
25 hdd 7.27699 osd.25 up 1.00000 1.00000
28 hdd 7.27699 osd.28 up 1.00000 1.00000
31 hdd 7.27699 osd.31 down 0 1.00000
34 hdd 7.27699 osd.34 down 0 1.00000
38 hdd 7.27699 osd.38 down 0 1.00000
41 hdd 7.27699 osd.41 down 1.00000 1.00000
42 hdd 7.27699 osd.42 down 0 1.00000
45 hdd 7.27699 osd.45 up 1.00000 1.00000
48 hdd 7.27699 osd.48 up 1.00000 1.00000
51 hdd 7.27699 osd.51 down 1.00000 1.00000
54 hdd 7.27699 osd.54 up 1.00000 1.00000
57 hdd 7.27699 osd.57 down 1.00000 1.00000
60 hdd 7.27699 osd.60 down 1.00000 1.00000
63 hdd 7.27699 osd.63 up 1.00000 1.00000
66 hdd 7.27699 osd.66 down 1.00000 1.00000
69 hdd 7.27699 osd.69 up 1.00000 1.00000
72 hdd 7.27699 osd.72 up 1.00000 1.00000
75 hdd 7.27699 osd.75 down 1.00000 1.00000
79 hdd 7.27699 osd.79 up 1.00000 1.00000
82 hdd 7.27699 osd.82 down 1.00000 1.00000
85 hdd 7.27699 osd.85 down 1.00000 1.00000
89 hdd 7.27699 osd.89 down 0 1.00000
92 hdd 7.27699 osd.92 down 1.00000 1.00000
95 hdd 7.27699 osd.95 down 0 1.00000
98 hdd 7.27699 osd.98 down 0 1.00000
101 hdd 7.27699 osd.101 down 1.00000 1.00000
104 hdd 7.27699 osd.104 down 0 1.00000
107 hdd 7.27699 osd.107 up 1.00000 1.00000
Ceph status:
[root@ceph001 ~]# ceph status
cluster:
id: 54052e72-6835-410e-88a9-af4ac17a8113
health: HEALTH_WARN
1 filesystem is degraded
1 MDSs report slow metadata IOs
48 osds down
Reduced data availability: 2053 pgs inactive, 2043 pgs down, 7 pgs peering, 3 pgs incomplete, 126 pgs stale
Degraded data redundancy: 18473/27200783 objects degraded (0.068%), 106 pgs degraded, 103 pgs undersized
too many PGs per OSD (258 > max 250)
services:
mon: 3 daemons, quorum filler001,filler002,bezavrdat-master01
mgr: bezavrdat-master01(active), standbys: filler002, filler001
mds: cephfs-1/1/1 up {0=filler002=up:replay}, 1 up:standby
osd: 108 osds: 32 up, 80 in; 16 remapped pgs
data:
pools: 2 pools, 2176 pgs
objects: 2.73 M objects, 1.7 TiB
usage: 2.3 TiB used, 580 TiB / 582 TiB avail
pgs: 94.347% pgs not active
18473/27200783 objects degraded (0.068%)
1951 down
79 active+undersized+degraded
76 stale+down
23 stale+active+undersized+degraded
14 down+remapped
14 stale+active+clean
6 stale+peering
3 active+clean
3 stale+active+recovery_wait+degraded
2 incomplete
2 stale+down+remapped
1 stale+incomplete
1 stale+remapped+peering
1 active+recovering+undersized+degraded+remapped
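On the "too many PGs per OSD (258 > max 250)" warning: the per-OSD PG count is roughly the sum over pools of pg_num times the pool size (replicas, or k+m shards for EC), divided by the number of "in" OSDs. A rough sketch with assumed pool shapes (the pg_num and size values are hypothetical; only the 80 "in" OSDs come from the status above):

```shell
# Hypothetical pool shapes; only in_osds is taken from the status output.
pg_num=2048;  pool_size=10   # e.g. an EC data pool with k+m = 10 shards
meta_pg=128;  meta_size=3    # e.g. a replicated cephfs metadata pool
in_osds=80                   # "osd: 108 osds: 32 up, 80 in"

pgs_per_osd=$(( (pg_num * pool_size + meta_pg * meta_size) / in_osds ))
echo "approx PGs per OSD: $pgs_per_osd"
```

Note that this ratio climbs as OSDs drop out and the "in" count shrinks, which may be why the warning only appears now.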
Thank you in advance!
Regards,
Andrea Del Monaco
HPC Consultant – Big Data & Security
M: +31 612031174
Burgemeester Rijnderslaan 30 – 1185 MC Amstelveen – The Netherlands
atos.net
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
--
Cheers,
Brad