Hi,

On 29/06/2016 12:00, Mario Giammarco wrote:
> Now the problem is that ceph has put out two disks because scrub has
> failed (I think it is not a disk fault but due to mark-complete)

There is something odd going on. I've only ever seen deep-scrub failures (i.e. an inconsistency is detected and the PG is marked accordingly), so I'm not sure what happens on a "simple" scrub failure, but what should not happen is the whole OSD going down on a scrub or deep-scrub failure, which is what you seem to imply happened.

Do you have logs for these two failures giving a hint at what happened (probably /var/log/ceph/ceph-osd.<n>.log)? Any kernel log pointing to hardware failure(s) around the time these events happened?

Another point: you said that you had one disk "broken". Usually Ceph handles this case in the following manner:
- the OSD detects the problem and commits suicide (unless it's configured to ignore I/O errors, which is not the default),
- your cluster is then in a degraded state with one OSD down/in,
- after a timeout (several minutes), Ceph decides that the OSD won't come back up soon and marks it "out" (so one OSD down/out),
- as the OSD is now out, CRUSH adapts PG placement based on the remaining available OSDs and brings all degraded PGs back to a clean state by creating the missing replicas while moving PGs around. You see a lot of I/O and many PGs in wait_backfill/backfilling states at this point,
- when all of this is done, the cluster is back to HEALTH_OK.

When your disk was broken and you waited 24 hours, how far along this process was your cluster? (See the P.S. below for how I would check.)

Best regards,

Lionel
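P.S. In case it helps, here is a minimal sketch of the commands I would run to answer the questions above. It assumes a standard Ceph install with the usual CLI available on a monitor/admin node; replace <n> with the id of the affected OSD, nothing here is specific to your setup:

  # overall cluster health and a summary of degraded/misplaced data
  ceph -s
  ceph health detail

  # which OSDs are up/down and in/out (tells you whether the broken OSD
  # was ever marked out; the down -> out timeout is controlled by
  # mon_osd_down_out_interval, several minutes by default)
  ceph osd tree

  # PGs that never went back to active+clean (wait_backfill, backfilling,
  # degraded, ...)
  ceph pg dump_stuck unclean

  # scrub / deep-scrub errors in the OSD log mentioned above
  grep -i scrub /var/log/ceph/ceph-osd.<n>.log | tail -n 50

  # kernel-side evidence of a failing disk around the same time
  dmesg | grep -iE 'ata|sd[a-z]|i/o error' | tail -n 50

The first three commands should be enough to tell how far the down -> out -> backfill -> HEALTH_OK sequence described above actually got on your cluster.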