Hi,

I was doing some performance tuning on a test cluster of just 2 nodes (each with 10 OSDs). I have a test pool with 2 replicas (size=2, min_size=2). Then one of the OSDs crashed due to a failing hard drive. All remaining OSDs were fine, but the health status reported one lost object. Here's the detail:

    "recovery_state": [
        {
            "name": "Started\/Primary\/Active",
            "enter_time": "2016-05-04 07:59:10.706866",
            "might_have_unfound": [
                {
                    "osd": "0",
                    "status": "osd is down"
                },
                {
                    "osd": "10",
                    "status": "already probed"
                }
            ],

It was not important data, so I just discarded the object as I don't need to recover it, but now I'm wondering what caused all this. I have min_size set to 2 and I thought writes are only acknowledged once they reach the journals of all target OSDs, no? Is there something specific I should check? Maybe I have a bug in my configuration? Or how else could this object have been lost? (The commands I used are in the PS below.)

I'd be grateful for any info.

br

nik

-- 
-------------------------------------
Ing. Nikola CIPRICH
LinuxBox.cz, s.r.o.
28.rijna 168, 709 00 Ostrava

tel.:   +420 591 166 214
fax:    +420 596 621 273
mobil:  +420 777 093 799
www.linuxbox.cz

mobil servis: +420 737 238 656
email servis: servis@xxxxxxxxxxx
-------------------------------------
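
PS: for reference, these are roughly the commands I used to inspect the pool and to discard the unfound object (the pool name "testpool" and the PG id "2.5" below are placeholders, not my real ones):

    # replica count and the minimum replicas needed to accept I/O
    ceph osd pool get testpool size
    ceph osd pool get testpool min_size

    # shows which PG is reporting the unfound object
    ceph health detail

    # per-PG detail -- this is where the "recovery_state" /
    # "might_have_unfound" output quoted above comes from
    ceph pg 2.5 query

    # what I ran to give up on the object, since I didn't need it back
    # ("revert" is the alternative to rolling back instead of deleting)
    ceph pg 2.5 mark_unfound_lost delete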