Re: What's the actual justification for min_size?

I definitely saw it on a Hammer cluster, though I decided to check my IRC logs for more context and found that in my specific cases it was due to PGs going incomplete. `ceph health detail` offered the following, for instance:

pg 8.31f is remapped+incomplete, acting [39] (reducing pool one min_size from 2 may help; search ceph.com/docs for 'incomplete')

And I had to do it on at least a couple of occasions while managing that cluster. I don't remember ever hitting the issue again after moving to Infernalis and beyond, though. FWIW it was a 60-disk cluster with an above-average failure rate, because many of my disks were donations from another project and were already several years old.
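For anyone who lands here from a search: the first thing I'd do is poke at the PG and the pool settings before touching anything. A rough sketch of that, not a runbook (the pool name "one" is just what appears in the health output above; these are all standard ceph CLI calls):

    # See why the PG is incomplete and which OSDs it wants:
    ceph pg 8.31f query

    # Check what the pool is actually configured with:
    ceph osd pool get one size
    ceph osd pool get one min_size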

I guess my curiosity is sated: min_size matters when you also consider the transient faults that take disks down and bring them back up, since it prevents inconsistent state and lost writes in that window. It's less relevant for complete disk failures, because if a replica is irretrievably lost all you can do is rebuild it anyway, and you're only $size badly-timed disk failures away from losing a PG entirely, regardless of the min_size setting.
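For the record, the careful version of the dance aad describes below went roughly like this on my cluster. Treat it as a sketch under my assumptions (replicated pool named "one" with a normal min_size of 2), not a general recipe:

    # Temporarily lower min_size -- NOT size -- so the stuck PG can peer:
    ceph osd pool set one min_size 1

    # Watch it heal; don't walk away:
    ceph -w

    # Revert as soon as the affected PGs are active+clean:
    ceph osd pool set one min_size 2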

On 21/03/17 23:14, Anthony D'Atri wrote:
> I’m fairly sure I saw it as recently as Hammer, definitely Firefly. YMMV.
> 
> 
>> On Mar 21, 2017, at 4:09 PM, Gregory Farnum <gfarnum@xxxxxxxxxx> wrote:
>>
>> You shouldn't need to set min_size to 1 in order to heal any more. That was the case a long time ago but it's been several major LTS releases now. :)
>> So: just don't ever set min_size to 1.
>> -Greg
>> On Tue, Mar 21, 2017 at 6:04 PM Anthony D'Atri <aad@xxxxxxxxxxxxxx> wrote:
>>>> A min_size of 1 is dangerous, though, because it means you are one hard disk failure away from losing the objects within that placement group entirely. A min_size of 2 is generally considered the minimum you want, but many people ignore that advice; some wish they hadn't.
>>>
>>> I admit I am having difficulty following why this is the case.
>>
>> I think we have a case of fervently agreeing.
>>
>> Setting min_size on a specific pool to 1 to allow PG’s to heal is absolutely a normal thing in certain circumstances, but it’s important to
>>
>> 1) Know _exactly_ what you’re doing, to which pool, and why
>> 2) Do it very carefully; accidentally changing ‘size’ instead of ‘min_size’ on a busy pool with a bunch of PG’s and data can be quite the rude awakening.
>> 3) Most importantly, _only_ set it for the minimum time needed, with eyes watching the healing, and set it back immediately after all affected PG’s have peered and healed.
>>
>> The danger, which I think is what Wes was getting at, is in leaving it set to 1 all the time, or forgetting to revert it.  THAT is, as we used to say, begging to lose.
>>
>> — aad
