Re: [Ceph] Recovery is very Slow

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

Try to change settings you mentioned to increase recover speed but keep in mind recover speed is also linked to :

- Hardware, NVMe/SSD will recover way faster than HDD
- Object size, in your case they doesn’t seem to be large
- Number of OSD in cluster, here you added an empty OSD so it receives all (if you use replica 3) data. If you had lost a single OSD in a 100 OSD cluster, many of them would receive data.
- If you use replication or erasure coding

Also if you recover ’too’ fast it may decrease QOS for your clients io.

Do you plan to only use 3 OSD? If so, you should maybe use something else than Ceph.


-
Etienne Menguy
etienne.menguy@xxxxxxxx




On 28 Oct 2021, at 09:35, Janne Johansson <icepic.dz@xxxxxxxxx> wrote:



Den tors 28 okt. 2021 kl 09:09 skrev Lokendra Rathour <lokendrarathour@xxxxxxxxx>:
Hi,
we have been trying to test  a scenario on ceph with the following configuration:
 cluster:
    id:     cc0ba1e4-68b9-4237-bc81-40b38455f713
    health: HEALTH_OK
  services:
    mon: 3 daemons, quorum storagenode1,storagenode2,storagenode3 (age 4h)
    mgr: storagenode2(active, since 22h), standbys: storagenode1, storagenode3
    mds: cephfs:1 {0=storagenode1=up:active} 2 up:standby
    osd: 3 osds: 3 up (since 4m), 3 in (since 4h)
    rgw: 3 daemons active (storagenode1.rgw0, storagenode2.rgw0, storagenode3.rgw0)
  task status:
    scrub status:
        mds.storagenode1: idle
  data:
    pools:   7 pools, 169 pgs
    objects: 1.06M objects, 1.3 TiB
    usage:   3.9 TiB used, 9.2 TiB / 13 TiB avail
    pgs:     169 active+clean
  io:
    client:   43 KiB/s wr, 0 op/s rd, 3 op/s wr
    recovery: 154 MiB/s, 98 objects/s
 
We have network links of 10GiG for all the networks used in Ceph. MTU is configured as 9000. But the Transfer rate as can be seen above is max 154 MiB/s which I feel is way low than possible. 

Test Case:
We removed one node and added it back to the Ceph Cluster after reinstalling the OS. During this time of activity, Ceph has around 1.3 TB to rebalance in the newly added node. The time taken in such a case is approximate: 4 hours. 

Considering this as the production-grade setup with all production-grade infra, this time is too much.

Query:
  • Is there a way to optimize the recovery/rebalancing and i/o rate of Ceph?
  • we found a few suggestions on the internet that we can modify the below parameters to achieve a good rate, but is this advisable
    •   osd max backfills, osd recovery max active, osd recovery max single start 
  • we have dedicated 10gig n/w infra so can we have any ideal value to reach max rate of recovery.

Any input would be helpful, we are really blocked here.


If this is one spinning drive receiving data, then those figures look ok. If you instead had a large cluster with more drives, the sum of the recovery traffic would be more if you allow more parallelism. Looking at osd_max_backfills to see how many parallel backfills you will allow and looking at posts and guides like this:
might also help.



--
May the most significant bit of your life be positive.
_______________________________________________
Dev mailing list -- dev@xxxxxxx
To unsubscribe send an email to dev-leave@xxxxxxx

_______________________________________________
Dev mailing list -- dev@xxxxxxx
To unsubscribe send an email to dev-leave@xxxxxxx

[Index of Archives]     [CEPH Users]     [Ceph Devel]     [Ceph Large]     [Information on CEPH]     [Linux BTRFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux