Re: best practices for expanding hammer cluster

In my case my cluster is under very little active load, so I have never had to worry about recovery operations impacting client traffic. In fact, I generally tune up from the defaults (increasing osd max backfills) to improve recovery speed when I'm doing major changes, because there's plenty of spare capacity in the cluster. Either way, I'm in the fortunate position of being able to place a higher value on having a HEALTH_OK cluster ASAP than on keeping client I/O consistent.
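For concreteness, the sort of tuning I mean looks roughly like this - a sketch only, with example values; the defaults shown for reverting are the Hammer-era ones, and you should pick numbers that suit your own hardware:

```shell
# Temporarily raise recovery/backfill throttles across all OSDs
# (assumes spare cluster capacity; will increase recovery impact on clients)
ceph tell osd.* injectargs '--osd-max-backfills 4 --osd-recovery-max-active 8'

# ... perform the reweight / crush change and wait for HEALTH_OK ...

# Drop back to the (Hammer) defaults afterwards
ceph tell osd.* injectargs '--osd-max-backfills 1 --osd-recovery-max-active 3'
```

injectargs changes take effect at runtime only; put the values in ceph.conf if you want them to persist across OSD restarts.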

Rich

On 19/07/17 16:27, Laszlo Budai wrote:
> Hi Rich,
> 
> Thank you for your answer. This is good news to hear :)
> Regarding the reconfiguration you've done: if I understand correctly, you changed it all at once (i.e. downloaded the crush map, edited it to add all the new OSDs, and uploaded the new map to the cluster). How did you control the impact of the recovery/backfill operations on your clients' data traffic? What settings did you use to avoid slow requests?
> 
> Kind regards,
> Laszlo
> 
> 
> On 19.07.2017 17:40, Richard Hesketh wrote:
>> On 19/07/17 15:14, Laszlo Budai wrote:
>>> Hi David,
>>>
>>> Thank you for that reference about CRUSH. It's a nice one.
>>> There I could read about expanding the cluster, but in one of our cases we want to do more: we want to move from a host failure domain to a chassis failure domain. Our concern is: how will ceph behave for those PGs where all three replicas are currently in the same chassis? According to the new CRUSH map, two of those replicas will be in the wrong place.
>>>
>>> Kind regards,
>>> Laszlo
>>
>> Changing crush rules so that PGs are remapped works exactly the same way as remapping caused by changes in crush weights. The PGs will be remapped in accordance with the new crushmap/rules, and then recovery operations will copy them over to the new OSDs as usual. Even if a PG is entirely remapped, the OSDs that were originally hosting it will remain the acting set and continue to serve I/O and replicate data until the copies on the new OSDs are ready to take over - ceph won't throw a fit just because the acting set doesn't comply with the crush rules.
>>
>> I have done, for instance, a crush rule change which remapped an entire pool - switching the cephfs metadata pool from an HDD root rule to an SSD root rule, so every single PG moved to a completely different set of OSDs - and it all continued to work fine while recovery took place.
>>
>> Rich
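For reference, the whole-map edit workflow discussed above looks roughly like this - a sketch only; filenames are arbitrary, and the chassis buckets must already be defined in the edited map before a rule can choose across them:

```shell
# Fetch and decompile the current CRUSH map
ceph osd getcrushmap -o crushmap.bin
crushtool -d crushmap.bin -o crushmap.txt

# Edit crushmap.txt: add chassis buckets / move hosts under them, then
# change the replicated rule's failure domain, e.g.
#     step chooseleaf firstn 0 type host
# becomes
#     step chooseleaf firstn 0 type chassis

# Recompile and inject the new map (this triggers the remapping/backfill)
crushtool -c crushmap.txt -o crushmap.new
ceph osd setcrushmap -i crushmap.new
```

Uploading the map all at once means all the resulting data movement starts immediately, so this is the point at which the backfill throttles discussed in this thread matter.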


_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
