Hi,

During an ongoing recovery in one of my clusters, a couple of OSDs complained about a too-small journal. For instance:

2012-05-12 13:31:04.034144 7f491061d700 1 journal check_for_full at 863363072 : JOURNAL FULL 863363072 >= 1048571903 (max_size 1048576000 start 863363072)
2012-05-12 13:31:04.034680 7f491061d700 0 journal JOURNAL TOO SMALL: item 1693745152 > journal 1048571904 (usable)

I was under the impression that the OSDs stopped participating in recovery after this event ("ceph -w" showed that the number of PGs in the active+clean state no longer increased). They resumed recovery after I enlarged their journals (stop the OSD, --flush-journal, --mkjournal, start the OSD).

How serious is such a situation? Do the OSDs know how to handle it correctly, or could it result in data loss or corruption? After the recovery finished ("ceph -w" showed that all PGs were in the active+clean state), I noticed that a few rbd images were corrupted.

The cluster runs v0.46. The OSDs use ext4. I'm pretty sure that no clients were accessing the cluster during the recovery.

Best regards,
Karol
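For reference, the journal enlargement procedure I mention above was roughly the following. This is a sketch, not an exact transcript: the OSD id (0), the new journal size, and the use of the "service ceph" init script are placeholders for my actual setup, and "osd journal size" in ceph.conf is given in megabytes.

```shell
# Stop the OSD whose journal is too small (osd.0 used as an example).
service ceph stop osd.0

# Flush all pending journal entries to the object store, so nothing
# is lost when the journal is recreated.
ceph-osd -i 0 --flush-journal

# Increase the journal size in ceph.conf before recreating it, e.g.:
#   [osd]
#   osd journal size = 2048   ; MB

# Create a fresh (larger) journal, then restart the OSD.
ceph-osd -i 0 --mkjournal
service ceph start osd.0
```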