Re: Help: pool not responding

Shinobu Kinjo <skinjo@xxxxxxxxxx> · Mon, 29 Feb 2016 17:33:37 -0500 (EST)

> Probably not (unless they reveal themselves extremely unreliable with
> Ceph OSD usage patterns which would be surprising to me).

Thank you for sharing your thoughts.
That does make sense.

Cheers,

----- Original Message -----
From: "Lionel Bouton" <lionel-subscription@xxxxxxxxxxx>
To: "Shinobu Kinjo" <skinjo@xxxxxxxxxx>
Cc: "Mario Giammarco" <mgiammarco@xxxxxxxxx>, ceph-users@xxxxxxxxxxxxxx
Sent: Tuesday, March 1, 2016 6:56:05 AM
Subject: Re:  Help: pool not responding

On 29/02/2016 22:50, Shinobu Kinjo wrote:
>> the fact that they are optimized for benchmarks and certainly not
>> Ceph OSD usage patterns (with or without internal journal).
> Are you assuming that SSHD is causing the issue?
> If you could elaborate on this more, it would be helpful.

Probably not (unless they reveal themselves extremely unreliable with
Ceph OSD usage patterns which would be surprising to me).

For incomplete PGs, the documentation seems clear enough about what
should be done:
http://docs.ceph.com/docs/master/rados/operations/pg-states/

The relevant text:

/Incomplete/
    Ceph detects that a placement group is missing information about
    writes that may have occurred, or does not have any healthy copies.
    If you see this state, try to start any failed OSDs that may contain
    the needed information or temporarily adjust min_size to allow recovery.

We don't have the full history, but the most probable cause of these
incomplete PGs is that min_size is set to 2 or 3 and at some point the 4
incomplete PGs had fewer replicas available than the min_size value. So if
setting min_size to 2 isn't enough, setting it to 1 should unfreeze them.
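
As a rough illustration of the reasoning above, here is a minimal Python
sketch. It models the min_size rule in a deliberately simplified way (real
peering involves more than a replica count) and builds, without running
them, the "ceph osd pool set <pool> min_size <n>" commands to try in order.
The pool name "rbd" and the starting min_size of 3 are assumptions for the
example, not values known from this thread.

```python
import shlex


def blocks_io(available_replicas, min_size):
    # Simplified model of the rule discussed above: a PG stops serving
    # I/O while fewer replicas than min_size are available.
    return available_replicas < min_size


def candidate_min_sizes(current_min_size):
    # Values to try in order, lowering one step at a time, never below 1.
    return list(range(current_min_size - 1, 0, -1))


def set_min_size_command(pool, min_size):
    # Builds (does not run) the ceph CLI command that changes min_size.
    return shlex.join(
        ["ceph", "osd", "pool", "set", pool, "min_size", str(min_size)]
    )


if __name__ == "__main__":
    pool = "rbd"   # hypothetical pool name
    current = 3    # hypothetical current min_size
    for value in candidate_min_sizes(current):
        print(set_min_size_command(pool, value))
    # The documentation says to adjust min_size only temporarily, so the
    # last command restores the original value once recovery is done.
    print(set_min_size_command(pool, current))
```

The current value can be checked first with "ceph osd pool get <pool>
min_size". Lowering it is meant as a temporary measure, as the quoted
documentation says.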

Lionel