Re: help

Amudhan P <amudhan83@xxxxxxxxx> · Fri, 30 Aug 2019 16:05:29 +0530

My cluster health status went to warning only after I ran mkdir to create thousands of folders with multiple subdirectories. If that is what made an OSD crash, does it really take that long to heal empty directories?

On Fri, Aug 30, 2019 at 3:12 PM Janne Johansson <icepic.dz@xxxxxxxxx> wrote:
On Fri, 30 Aug 2019 at 10:49, Amudhan P <amudhan83@xxxxxxxxx> wrote:
After leaving it for 12 hours, the cluster status is now healthy, but why did backfill take such a long time? How do I fine-tune it in case the same kind of error pops up again?

The backfilling is taking a while because max_backfills = 1 and you only have 3 OSDs in total, so the backfill of each PG has to wait for the previous PG's backfill to complete.

That setting is the main tuning knob, EXCEPT that raising it comes at the expense of client traffic. You can allow a larger number of parallel recoveries and backfills, but of course the impact on your client IO will be more noticeable if you do.
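
As a rough illustration of raising those limits at runtime, here is a minimal sketch that only builds the command line and does not run anything. It assumes the option names osd_max_backfills and osd_recovery_max_active and the "ceph tell osd.* injectargs" form. The helper name backfill_tuning_command and the values 2 and 4 are made up for the example and are not a recommendation.

```python
import shlex


def backfill_tuning_command(max_backfills, recovery_max_active):
    """Build (but do not execute) a `ceph tell` command that changes
    the backfill and recovery limits on all OSDs at runtime.

    Hypothetical helper: the name and argument layout are assumptions
    made for this sketch.
    """
    if max_backfills < 1 or recovery_max_active < 1:
        raise ValueError("limits must be at least 1")
    injected = "--osd-max-backfills {} --osd-recovery-max-active {}".format(
        max_backfills, recovery_max_active
    )
    # Returned as an argv list so it could be handed to subprocess.run()
    # without going through a shell.
    return ["ceph", "tell", "osd.*", "injectargs", injected]


if __name__ == "__main__":
    cmd = backfill_tuning_command(2, 4)
    # Print a shell-quoted version for copy and paste.
    print(" ".join(shlex.quote(part) for part in cmd))
```

Values injected this way are not persistent across OSD restarts, so it is easy to go back to the defaults once the cluster is healthy again and client IO matters more than recovery speed.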

Lastly, getting backfill MB/s up is "best" done by having a large number of OSD hosts and fast OSD drives, and letting the cluster work in parallel. With only 3 drives you will see no parallelism (if you have size=3, all OSDs are always involved in every single PG that needs to recover), and you will just see overhead compared to what disk-read and disk-write would give on a single drive.
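
To put rough numbers on the parallelism argument, here is a back-of-the-envelope sketch. It is a deliberately simplified model, not how Ceph actually schedules reservations: it assumes every backfilling PG occupies one backfill slot on each of its size OSDs, and that PGs are spread perfectly evenly. The function name max_concurrent_backfills is made up for the example.

```python
def max_concurrent_backfills(num_osds, replica_size, max_backfills):
    """Upper bound on how many PGs can backfill at the same time.

    Simplified model (an assumption of this sketch): each backfilling
    PG takes one slot on each of its `replica_size` OSDs, and every OSD
    offers `max_backfills` slots.
    """
    if num_osds < 1 or replica_size < 1 or max_backfills < 1:
        raise ValueError("all arguments must be at least 1")
    if replica_size > num_osds:
        raise ValueError("replica_size cannot exceed num_osds")
    total_slots = num_osds * max_backfills
    return total_slots // replica_size


if __name__ == "__main__":
    for osds in (3, 30):
        bound = max_concurrent_backfills(osds, replica_size=3, max_backfills=1)
        print(osds, "OSDs, size=3, max_backfills=1 ->", bound, "PG(s) at a time")
```

Under this model, 3 OSDs with size=3 and max_backfills = 1 can only ever work on one PG at a time, while the same settings on 30 OSDs would allow up to ten.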

-- 
May the most significant bit of your life be positive.

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx