I don't think the script will help our situation, as it is just setting osd_max_backfills from 1 to 0. It looks like that change doesn't take effect until after the OSD finishes the PG it is currently backfilling. It would be nice if backfill/recovery could skip the journal, but there would have to be some logic for the case where an object is changed while it is being replicated. Maybe just log in the journal when an object's restore starts and when it finishes, so the journal flush knows whether it needs to commit the write?
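For anyone following along without the pastebin link, my understanding of the script is roughly the following. This is a guess at its shape based on Lionel's description below, not the actual code from http://pastebin.com/sy7h1VEy:

#!/bin/bash
# Guess at the throttler's shape, based on the descriptions in this thread;
# not the actual contents of http://pastebin.com/sy7h1VEy.
# Usage: ./throttler <run_seconds> <pause_seconds>
run=$1
pause=$2
while true; do
    # let backfills proceed (one concurrent backfill per OSD)
    ceph tell 'osd.*' injectargs '--osd_max_backfills 1'
    sleep "$run"
    # freeze backfills; as noted above, this only takes effect once an
    # OSD finishes the PG it is currently backfilling
    ceph tell 'osd.*' injectargs '--osd_max_backfills 0'
    sleep "$pause"
done

Even with a scheme like this, the 0 only matters at PG boundaries, which is why I don't think it helps with PGs as large as ours.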
----------------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1

On Thu, Sep 10, 2015 at 3:33 PM, Lionel Bouton wrote:
> On 10/09/2015 22:56, Robert LeBlanc wrote:
>> We are trying to add some additional OSDs to our cluster, but the
>> impact of the backfilling has been very disruptive to client I/O, and
>> we have been trying to figure out how to reduce it. We have seen some
>> client I/O blocked for more than 60 seconds. There has been CPU and
>> RAM headroom on the OSD nodes, the network has been fine, and the
>> disks have been busy but not terrible.
>
> It seems you've already exhausted most of the ways I know of. When
> confronted with this situation, I used a simple script to throttle
> backfills (freezing them, then re-enabling them). This helped our VMs
> at the time, but you must be prepared for very long migrations and
> some experimentation with different schedules. You simply pass it the
> number of seconds backfills are allowed to proceed, then the number of
> seconds during which they pause.
>
> Here's the script, which should be self-explanatory:
> http://pastebin.com/sy7h1VEy
>
> Something like:
>
> ./throttler 10 120
>
> limited the impact on our VMs (the idea being that during the 10s the
> backfill won't be able to trigger filestore syncs, and the 120s pause
> will let the filestore syncs remove "dirty" data from the journals
> without interfering too much with concurrent writes).
> I believe you need a high filestore sync interval to benefit from
> this (we use 30s).
> At the very least, the long pause will eventually let the VMs move
> data to disk regularly instead of being nearly frozen.
>
> Note that your PGs are more than 10 GB each; if the OSDs can't stop a
> backfill before finishing the transfer of the current PG, this won't
> help (I assume backfills go through the journals, and they probably
> won't be able to act as write-back caches anymore, as a single PG
> will be enough to fill them up).
>
> Best regards,
>
> Lionel