Re: Requests blocked in degraded erasure coded pool

Gregory Farnum <gfarnum@xxxxxxxxxx> · Wed, 07 Jun 2017 19:40:45 +0000

On Wed, Jun 7, 2017 at 12:30 PM Jonas Jaszkowic <jonasjaszkowic@xxxxxxxxxxxxxx> wrote:

On 07.06.2017 at 20:29, Gregory Farnum <gfarnum@xxxxxxxxxx> wrote:

We prevent PGs from going active (and serving writes or reads) when they have fewer than "min_size" OSDs participating. This is generally set so that we have enough redundancy to recover from at least one OSD failing.

Do you mean the min_size value from the crush rule? I set min_size = 2, so a 2+3 EC pool with 3 killed OSDs still has the minimum of 2 OSDs and should be able to fully recover the data, right?

If you set min_size 2 before taking the OSDs down, that does seem odd.

In your case, you have 2 OSDs and the failure of either one of them results in the loss of all written data. So we don't let you go active as it's not safe.

I get that it makes no sense to serve writes at this point because we cannot provide the desired redundancy, but how is preventing me from going active safer than just serving reads? I think what bugs me is that by definition of the erasure code used, we should be able to lose 3 OSDs and still get our data back, which is not the case in this scenario because our cluster refuses to go active.

Yeah, we just don't have a way of serving reads without serving writes at the moment. It's a limit of the architecture.
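
As a rough illustration of the rule discussed above, here is a toy Python model of the min_size gate for a single PG in a k+m erasure coded pool. It is only a sketch, not Ceph's actual peering logic: the function pg_status and its parameter names are hypothetical, and it assumes a code where any k surviving chunks are enough to rebuild the data.

```python
def pg_status(k, m, min_size, osds_down):
    """Toy model of the min_size gate for one erasure coded PG.

    k, m      -- data and coding chunk counts of the pool (hypothetical names)
    min_size  -- the pool's min_size setting
    osds_down -- how many of the PG's k + m OSDs are currently down
    """
    size = k + m
    if not 0 <= osds_down <= size:
        raise ValueError("osds_down must be between 0 and k + m")
    up = size - osds_down
    return {
        "up": up,
        # Assumption: the data can be rebuilt from any k surviving chunks.
        "data_recoverable": up >= k,
        # The PG only goes active with at least min_size OSDs participating.
        "active": up >= min_size,
        # Further OSD failures that can be absorbed without losing data.
        "failures_tolerated": max(up - k, 0),
    }


# The 2+3 pool from this thread with 3 OSDs killed, for two min_size values.
for min_size in (2, 3):
    print(min_size, pg_status(k=2, m=3, min_size=min_size, osds_down=3))
```

In this model, with min_size = 3 the PG is still recoverable but not active, and it cannot absorb any further failure. With min_size = 2 the model would let the PG go active, which is why a blocked PG with min_size already set to 2 looks odd.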

-Greg
PS: please keep this on the list. It spreads the information and archives it for future reference by others. :)

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com