On Thu, 23 Aug 2012, Mandell Degerness wrote:
> I know this is an old build, but I just want to verify that this isn't
> an unknown bug.  For context, the attached log covers the time from
> when server .15 dropped off the net (we think power failure at this
> point).  OSDs 72, 73, 74, and 75 are on the node which apparently lost
> power.
>
> Ceph version 0.47.2
> Kernel version 3.4.4

Yep, that looks like #2867 / #2790.  The fix is a largish refactor of the
messenger, so it won't go into the stable kernels, unfortunately.

sage

>
> Patches applied:
> rbd-don-t-hold-spinlock-during-messenger-flush.patch
> crush-adjust-local-retry-threshold.patch
> crush-be-more-tolerant-of-nonsensical-crush-maps.patch
> crush-fix-tree-node-weight-lookup.patch
> crush-fix-memory-leak-when-destroying-tree-buckets.patch
> ceph-osd_client-fix-endianness-bug-in-osd_req_encode.patch
> rbd-protect-read-of-snapshot-sequence-number.patch
> rbd-store-snapshot-id-instead-of-index.patch
> ceph-don-t-set-WRITE_PENDING-too-early.patch
> ceph-messenger-rework-prepare_connect_authorizer.patch
> ceph-messenger-check-return-from-get_authorizer.patch
> ceph-define-ceph_auth_handshake-type.patch
> ceph-messenger-reduce-args-to-create_authorizer.patch
> ceph-ensure-auth-ops-are-defined-before-use.patch
> ceph-have-get_authorizer-methods-return-pointers.patch
> ceph-use-info-returned-by-get_authorizer.patch
> ceph-return-pointer-from-prepare_connect_authorizer.patch
> ceph-rename-prepare_connect_authorizer.patch
> ceph-add-auth-buf-in-prepare_write_connect.patch
> libceph-avoid-unregistering-osd-request-when-not-reg.patch
> libceph-fix-pg_temp-updates.patch
> ceph-check-PG_Private-flag-before-accessing-page-pri.patch
> rbd-Fix-ceph_snap_context-size-calculation.patch
> rbd-endian-bug-in-rbd_req_cb.patch
> libceph-osd_client-don-t-drop-reply-reference-too-ea.patch
> libceph-use-con-get-put-ops-from-osd_client.patch
> rbd-Clear-ceph_msg-bio_iter-for-retransmitted-messag.patch
> libceph-flush-msgr-queue-during-mon_client-shutdown.patch
>
> Build options:
> --with-cryptopp --without-nss --with-radosgw --without-fuse \
> --with-tcmalloc --without-hadoop --with-libatomic-ops \
> --with-system-libs3 --with-libaio --without-gtk2 \
> --localstatedir=/var