I asked them on Twitter; let's hope they elaborate on that. But yeah, I bet someone "optimized" it with mount -o nobarrier …
Jan
"We responded immediately and confirmed the issue was related to filesystem corruption on our storage platform. This incident impacted all block devices on our Ceph cluster."

Just guessing from that, I bet they lost power and discovered that their local filesystems/disks were misconfigured and could not stay consistent in that scenario (i.e., they lost data that had been acked as safe on disk). I would love it if they discussed more about what had actually gone wrong, though.

(Note that they discuss filesystem corruption on the storage platform, and that this fs corruption apparently broke the block devices. If the Ceph software were at fault, I'd expect them to phrase it the other way around.)
-Greg
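For anyone wanting to rule out the nobarrier guess on their own OSD hosts, here is a minimal sketch in Python. It assumes the mount table is available as text in /proc/mounts format and that disabled barriers appear there as `nobarrier` or `barrier=0`; the exact spelling depends on the filesystem and kernel version, and the function name `mounts_without_barriers` is made up for this illustration.

```python
# Mount options assumed to indicate disabled write barriers.
# The exact spelling varies by filesystem and kernel version.
BARRIER_OFF = {"nobarrier", "barrier=0"}


def mounts_without_barriers(mounts_text):
    """Return (device, mountpoint, fstype) for entries whose options disable barriers.

    mounts_text is expected to be in /proc/mounts format:
    device mountpoint fstype options dump pass
    """
    risky = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        device, mountpoint, fstype, options = fields[:4]
        if BARRIER_OFF & set(options.split(",")):
            risky.append((device, mountpoint, fstype))
    return risky
```

On a Linux host, passing it the contents of /proc/mounts (for example `mounts_without_barriers(open("/proc/mounts").read())`) lists the affected mounts. An empty result only means the option is not set at the filesystem level; a volatile write cache on the disk or controller can still lose acked writes.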
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com