I know this is an old build, but I just want to verify that this isn't an unknown bug.

For context, the attached log covers the time from when server .15 dropped off the net (we think power failure at this point). OSDs 72, 73, 74, and 75 are on the node which apparently lost power.

Ceph version 0.47.2
Kernel version 3.4.4

Patches applied:
rbd-don-t-hold-spinlock-during-messenger-flush.patch
crush-adjust-local-retry-threshold.patch
crush-be-more-tolerant-of-nonsensical-crush-maps.patch
crush-fix-tree-node-weight-lookup.patch
crush-fix-memory-leak-when-destroying-tree-buckets.patch
ceph-osd_client-fix-endianness-bug-in-osd_req_encode.patch
rbd-protect-read-of-snapshot-sequence-number.patch
rbd-store-snapshot-id-instead-of-index.patch
ceph-don-t-set-WRITE_PENDING-too-early.patch
ceph-messenger-rework-prepare_connect_authorizer.patch
ceph-messenger-check-return-from-get_authorizer.patch
ceph-define-ceph_auth_handshake-type.patch
ceph-messenger-reduce-args-to-create_authorizer.patch
ceph-ensure-auth-ops-are-defined-before-use.patch
ceph-have-get_authorizer-methods-return-pointers.patch
ceph-use-info-returned-by-get_authorizer.patch
ceph-return-pointer-from-prepare_connect_authorizer.patch
ceph-rename-prepare_connect_authorizer.patch
ceph-add-auth-buf-in-prepare_write_connect.patch
libceph-avoid-unregistering-osd-request-when-not-reg.patch
libceph-fix-pg_temp-updates.patch
ceph-check-PG_Private-flag-before-accessing-page-pri.patch
rbd-Fix-ceph_snap_context-size-calculation.patch
rbd-endian-bug-in-rbd_req_cb.patch
libceph-osd_client-don-t-drop-reply-reference-too-ea.patch
libceph-use-con-get-put-ops-from-osd_client.patch
rbd-Clear-ceph_msg-bio_iter-for-retransmitted-messag.patch
libceph-flush-msgr-queue-during-mon_client-shutdown.patch

Build options:
--with-cryptopp --without-nss --with-radosgw --without-fuse \
--with-tcmalloc --without-hadoop --with-libatomic-ops \
--with-system-libs3 --with-libaio --without-gtk2 \
--localstatedir=/var
Attachment:
cephoops
Description: Binary data