Re: Ceph recovery kills VMs even with the smallest priority

On Thu, Mar 29, 2018 at 7:27 AM Damian Dabrowski <scooty96@xxxxxxxxx> wrote:
Hello,

A few days ago I had a very strange situation.

I had to turn off a few OSDs for a while, so I set the noout,
nobackfill, and norecover flags and then stopped the selected OSDs.
Everything was fine, but when I started those OSDs again, all VMs went
down due to the recovery process (even though recovery priority was very low).
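
For reference, I set the flags and stopped the OSDs with commands along
these lines (the OSD IDs below are just examples):

    # Prevent stopped OSDs from being marked out, and block recovery/backfill.
    ceph osd set noout
    ceph osd set nobackfill
    ceph osd set norecover

    # Stop the selected OSDs on their host (IDs are placeholders).
    systemctl stop ceph-osd@12    # or: service ceph stop osd.12 on pre-systemd hosts
    systemctl stop ceph-osd@13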

So you forbade the OSDs from doing any recovery work, but then you turned on old ones that required recovery work to function properly?

And your cluster stopped functioning?




Here are the more important config values:
    "osd_recovery_threads": "1",
    "osd_recovery_thread_timeout": "30",
    "osd_recovery_thread_suicide_timeout": "300",
    "osd_recovery_delay_start": "0",
    "osd_recovery_max_active": "1",
    "osd_recovery_max_single_start": "5",
    "osd_recovery_max_chunk": "8388608",
    "osd_client_op_priority": "63",
    "osd_recovery_op_priority": "1",
    "osd_recovery_op_warn_multiple": "16",
    "osd_backfill_full_ratio": "0.85",
    "osd_backfill_retry_interval": "10",
    "osd_backfill_scan_min": "64",
    "osd_backfill_scan_max": "512",
    "osd_kill_backfill_at": "0",
    "osd_max_backfills": "1",



I don't know why Ceph started the recovery process while the norecover
and nobackfill flags were enabled, but the fact is that it killed all
the VMs.

Did it actually start recovering, or did you just see client IO pause?
I confess I don’t know what the behavior will be like with that combined set of flags, but I rather suspect it did what you told it to, and some PGs went down as a result.
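One way to tell the difference would be to look at the PG states while the
OSDs come back up, for example:

    # Cluster summary: does it report recovering/backfilling PGs, or down/peering ones?
    ceph -s
    ceph health detail

    # List PGs stuck in an unclean or inactive state.
    ceph pg dump_stuck unclean
    ceph pg dump_stuck inactive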
-Greg




Next, I turned off the noout, nobackfill, and norecover flags and things
started to look better. The VMs went back online while the recovery
process was still running. I didn't see a performance impact on the SSD
disks, but there was a huge impact on the spinners:
normally %util is about 25%, but during recovery it was nearly 100%,
and CPU load on the HDD-based VMs increased by ~400%.
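
Clearing the flags was done roughly like this, after which recovery
continued on its own:

    ceph osd unset noout
    ceph osd unset nobackfill
    ceph osd unset norecover

    # Watch recovery progress.
    ceph -w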

iostat fragment (during recovery):
Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sdh               0.30     1.00  150.90   36.00 13665.60   954.60   156.45    10.63   56.88   25.60  188.02   5.34  99.80
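
For reference, extended per-device statistics like these can be captured
during recovery with something like (the device name is just an example):

    # Print extended stats for sdh every second.
    iostat -x 1 /dev/sdh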


Now I'm a little lost and don't know the answers to a few questions.
1. Why did Ceph start recovery even though the nobackfill and norecover flags were enabled?
2. Why did recovery cause a much bigger performance impact while the
norecover and nobackfill flags were enabled?
3. Why, after norecover and nobackfill were turned off, did the cluster
start to look better while %util on the HDDs was so high (with
recovery_op_priority=1 and client_op_priority=63)? It is normally around
25%, but increased to nearly 100% during recovery.


Cluster information:
ceph version 0.94.9 (fe6d859066244b97b24f09d46552afc2071e6f90)
3x nodes (CPU E5-2630, 32 GB RAM, 6x 2 TB HDD with SSD journal, 3x 1 TB SSD
with NVMe journal), triple replication


I would be very grateful if somebody could help me.
Sorry if I've done something the wrong way; this is my first time
writing to a mailing list.
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
