Re: About scrub and deep-scrub


 



Well, if you have more PGs, each deep-scrub finishes faster, but you also have more PGs to scrub. It does let you stretch the deep-scrub interval, though. As I said, I can't recommend increasing the PG count in general; overloading OSDs with too many PGs can have a negative effect as well. If you can, I'd rather add more (smaller) OSDs to spread the PGs.
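For example, stretching the deep-scrub interval from the default one week to two would look roughly like this (untested here, the value is in seconds and only an illustration):

  ceph config set osd osd_deep_scrub_interval 1209600   # 14 days instead of the default 7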

Quoting Phong Tran Thanh <tranphong079@xxxxxxxxx>:

Hi Eugen,

Adding PGs to the cluster only helps by reducing the PG size and therefore
the scrub time, is that right? I think more than 150 and fewer than 200 PGs
per OSD is a good number.

My cluster does 2-3 GB/s of I/O around the clock, at about 2K IOPS. Is
increasing the PG count a good choice in that case?

On Mon, Oct 7, 2024 at 15:49 Eugen Block <eblock@xxxxxx> wrote:

So your PGs for pool 52 have a size of around 320 GB, which is quite
a lot, so it's not surprising that deep-scrubs take a long time. At
the same time, you already have more than 150 PGs per OSD. We had a
similar situation on a customer cluster this year, also with 12 TB
drives. We decided to increase the pg_num anyway to reduce the PG
sizes. They currently have around 380 PGs per large OSD (they have
lots of smaller OSDs as well), which still works fine. But they're
using it as an archive, so the IO is not very high. If you decide to
split PGs, keep in mind to increase mon_max_pg_per_osd and
osd_max_pg_per_osd_hard_ratio as well. I can't explicitly recommend
doubling your PGs per OSD as I'm not familiar with your cluster, the
load etc.; it's just something to think about.
Doubling the PG count would reduce the PG size to around 160 GB,
which is still a lot, but I probably wouldn't go further than that.
The OSD utilization is only around 40%; in this case, a cluster with
more (smaller) OSDs would probably have made more sense.
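If you do go that route, the steps would look roughly like this; <pool> and
the numbers are placeholders, not a recommendation for your cluster:

  ceph config set global mon_max_pg_per_osd 400
  ceph config set osd osd_max_pg_per_osd_hard_ratio 5
  ceph osd pool set <pool> pg_num <new_pg_num>   # e.g. double the current value, pgp_num follows on recent releases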

Quoting Phong Tran Thanh <tranphong079@xxxxxxxxx>:

> Hi Eugen
>
> Can you take a look and give me some advice on the number of PGs and the PG size?
>
> ID   CLASS  WEIGHT    REWEIGHT  SIZE    RAW USE  DATA     OMAP     META    AVAIL    %USE   VAR   PGS  STATUS
>   2    hdd  10.98349   1.00000  11 TiB  4.5 TiB  4.2 TiB  5.0 MiB  12 GiB  6.5 TiB  40.77  1.00  182      up
>  17    hdd  10.98349   1.00000  11 TiB  4.8 TiB  4.6 TiB   22 MiB  14 GiB  6.1 TiB  44.16  1.08  196      up
>  32    hdd  10.98349   1.00000  11 TiB  4.3 TiB  4.0 TiB   37 MiB  12 GiB  6.7 TiB  38.80  0.95  173      up
>  47    hdd  10.98349   1.00000  11 TiB  4.2 TiB  3.9 TiB  655 KiB  11 GiB  6.8 TiB  38.16  0.93  184      up
>  60    hdd  10.98349   1.00000  11 TiB  4.3 TiB  4.0 TiB   19 MiB  12 GiB  6.6 TiB  39.47  0.96  176      up
>  74    hdd  10.98349   1.00000  11 TiB  4.2 TiB  3.9 TiB   28 MiB  12 GiB  6.8 TiB  38.10  0.93  187      up
>  83    hdd  10.98349   1.00000  11 TiB  4.8 TiB  4.5 TiB  1.9 MiB  14 GiB  6.2 TiB  43.47  1.06  180      up
>  96    hdd  10.98349   1.00000  11 TiB  4.3 TiB  4.0 TiB   38 MiB  12 GiB  6.7 TiB  38.80  0.95  181      up
> 110    hdd  10.98349   1.00000  11 TiB  4.5 TiB  4.2 TiB  4.3 MiB  13 GiB  6.5 TiB  40.79  1.00  174      up
> 123    hdd  10.98349   1.00000  11 TiB  4.2 TiB  3.9 TiB  1.9 MiB  13 GiB  6.8 TiB  38.11  0.93  173      up
> 136    hdd  10.98349   1.00000  11 TiB  4.3 TiB  4.0 TiB   43 MiB  12 GiB  6.6 TiB  39.46  0.96  179      up
> .....
>
> PG    OBJECTS  DEGRADED  MISPLACED  UNFOUND  BYTES         OMAP_BYTES*  OMAP_KEYS*  LOG   LOG_DUPS  STATE         SINCE
> 52.0    80121         0          0        0  323105248               0           0  1747      3000  active+clean    94m
> 52.1    79751         0          0        0  321711556766            0           0  1727      3000  active+clean    21h
> 52.2    80243         0          0        0  323711878626            0           0  1618      3000  active+clean    30h
> 52.3    79892         0          0        0  322166010020            0           0  1627      3000  active+clean     9h
> 52.4    80267         0          0        0  323708219486            0           0  1658      3000  active+clean     5h
> 52.5    79996         0          0        0  322331504454            0           0  1722      3000  active+clean    18h
> 52.6    80190         0          0        0  323460394402            0           0  1759      3000  active+clean    15h
> 52.7    79998         0          0        0  322769143546            0           0  1720      3000  active+clean    26h
> 52.8    80292         0          0        0  323932173152            0           0  1691      3000  active+clean    21h
> 52.9    79808         0          0        0  321910742702            0           0  1675      3000  active+clean     7h
> 52.a    79751         0          0        0  321578061334            0           0  1822      3000  active+clean    26h
> 52.b    80287         0          0        0  323905164642            0           0  1793      3000  active+clean     6h
> ....
> Thanks Eugen
>
> On Mon, Oct 7, 2024 at 14:45 Eugen Block <eblock@xxxxxx> wrote:
>
>> Hi,
>>
>> disabling scrubbing in general is a bad idea, because you won't notice
>> any data corruption until it might be too late.
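>> For the record, scrubbing can be paused cluster-wide with the flags
>> below; I'm only listing them for completeness, I wouldn't leave them set:
>>
>>   ceph osd set noscrub
>>   ceph osd set nodeep-scrub
>>   # re-enable later with: ceph osd unset noscrub / ceph osd unset nodeep-scrub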
>> But you can fine-tune scrubbing instead, for example increase the scrub
>> intervals so that scrubs have a longer window to finish. Or if the client
>> load is mainly during business hours, adjust osd_scrub_begin_hour and
>> osd_scrub_end_hour to your needs.
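>> A rough sketch of what that could look like (the hours and the sleep
>> value are only examples, adjust them to your environment):
>>
>>   ceph config set osd osd_scrub_begin_hour 20   # only scrub between 20:00
>>   ceph config set osd osd_scrub_end_hour 6      # and 06:00
>>   ceph config set osd osd_scrub_sleep 0.1       # throttle scrub impact a bit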
>> And it also depends on the size of your PGs: the larger the PGs are,
>> the longer a deep-scrub takes. So splitting PGs can have quite a
>> positive effect in general. Inspect the 'ceph osd df' output as well as
>> 'ceph pg ls' (BYTES column); you can also share them here if you need
>> assistance interpreting those values.
>>
>> Regards,
>> Eugen
>>
>> Quoting Phong Tran Thanh <tranphong079@xxxxxxxxx>:
>>
>> > Hi ceph users!
>> >
>> > What about disabling scrub and deep-scrub? I want to disable them
>> > because of their impact on my cluster's I/O.
>> > If I disable scrubbing, how will it affect my cluster?
>> > When it is enabled, scrubbing does not complete and takes a long time.
>> >
>> >
>> > Thanks
>> > Skype: tranphong079
>>
>>
>>
>
>
> --
> Best regards,
> ----------------------------------------------------------------------------
>
> *Tran Thanh Phong*
>
> Email: tranphong079@xxxxxxxxx
> Skype: tranphong079





--
Best regards,
----------------------------------------------------------------------------

*Tran Thanh Phong*

Email: tranphong079@xxxxxxxxx
Skype: tranphong079


_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



