I don't think scrubs can cause this. Do you have autoscaler enabled?
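Something like the following should show whether the autoscaler is touching the metadata pool (the pool name below is only a placeholder):

    # the autoscaler's view of each pool: target ratios and pending pg_num changes
    ceph osd pool autoscale-status

    # if it turns out to be involved, it can be disabled per pool
    ceph osd pool set <metadata-pool> pg_autoscale_mode off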
Quoting Lars Köppel <lars.koeppel@xxxxxxxxxx>:
Hi,
thank you for your response.
I don't think this thread covers my problem, because the OSDs for the
metadata pool fill up at different rates. So I would think this is not a
direct problem with the journal.
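For reference, the imbalance between those OSDs should be visible with something like this (no special options assumed):

    # per-OSD utilization, including the OMAP and metadata bytes
    ceph osd df tree

    # pool-level usage, to compare against what the individual OSDs report
    ceph df detail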
Because we had problems with the journal earlier, I changed some
settings (see below). I have already restarted all MDS daemons multiple
times, but nothing changed.
The health warnings regarding cache pressure normally resolve after a
short period of time, once the heavy load on the client ends. Sometimes
they stay a bit longer because an rsync is running and copying data onto
the cluster (rsync is not good at releasing its caps).
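To check which clients are actually holding caps, something along these lines should work (the MDS daemon name is a placeholder):

    # list client sessions on the active MDS; each entry shows how many caps it holds
    ceph tell mds.<name> session ls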
Could it be a problem if scrubs run in the background most of the time? Can
this block other tasks or generate new data itself?
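For what it's worth, the amount of scrubbing going on can be checked with something like:

    # one-line PG summary, including how many PGs are currently scrubbing / deep-scrubbing
    ceph pg stat

    # how many scrubs each OSD is allowed to run in parallel
    ceph config get osd osd_max_scrubs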
Best regards,
Lars
global  basic     mds_cache_memory_limit                  17179869184
global  advanced  mds_max_caps_per_client                 16384
global  advanced  mds_recall_global_max_decay_threshold   262144
global  advanced  mds_recall_max_decay_rate               1.000000
global  advanced  mds_recall_max_decay_threshold          262144
mds     advanced  mds_cache_trim_threshold                131072
mds     advanced  mds_heartbeat_grace                     120.000000
mds     advanced  mds_heartbeat_reset_grace               7400
mds     advanced  mds_tick_interval                       3.000000
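To double-check that the running MDS actually picked these values up, something like this should do (daemon name is a placeholder):

    # configuration as reported by the running daemon, which can differ from the mon config db
    ceph config show mds.<name> | grep -E 'mds_cache|mds_recall|mds_heartbeat'

    # current cache usage versus the configured limit
    ceph tell mds.<name> cache status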
Lars Köppel
Developer
Email: lars.koeppel@xxxxxxxxxx
Phone: +49 6221 5993580
ariadne.ai (Germany) GmbH
Häusserstraße 3, 69115 Heidelberg
Amtsgericht Mannheim, HRB 744040
Geschäftsführer: Dr. Fabian Svara
https://ariadne.ai
On Tue, Jun 11, 2024 at 2:05 PM Eugen Block <eblock@xxxxxx> wrote:
Hi,
can you check if this thread [1] applies to your situation? You don't
have multi-active MDS enabled, but maybe it's still some journal
trimming, or maybe misbehaving clients? In your first post there were
health warnings regarding cache pressure and cache size. Are those
resolved?
[1]
https://lists.ceph.io/hyperkitty/list/ceph-users@xxxxxxx/thread/7U27L27FHHPDYGA6VNNVWGLTXCGP7X23/#VOOV235D4TP5TEOJUWHF4AVXIOTHYQQE
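If it is journal trimming, it should be visible with something like this (the MDS name is a placeholder, and perf dump has to be run where the daemon's admin socket is reachable):

    # MDS_TRIM / "behind on trimming" warnings would show up here
    ceph health detail

    # journal event/segment counters of the MDS
    ceph daemon mds.<name> perf dump mds_log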
Quoting Lars Köppel <lars.koeppel@xxxxxxxxxx>:
> Hello everyone,
>
> A short update on this problem.
> The zapped OSD has been rebuilt and now holds 1.9 TiB (the expected size,
> ~50%). The other two OSDs are now at 2.8 and 3.2 TiB respectively. They
> jumped up and down a lot, but the larger one has now also reached
> 'nearfull' status. How is this possible? What is going on?
>
> Does anyone have a suggestion for how to fix this without zapping the OSD?
>
> Best regards,
> Lars
>
>
> Lars Köppel
> Developer
> Email: lars.koeppel@xxxxxxxxxx
> Phone: +49 6221 5993580
> ariadne.ai (Germany) GmbH
> Häusserstraße 3, 69115 Heidelberg
> Amtsgericht Mannheim, HRB 744040
> Geschäftsführer: Dr. Fabian Svara
> https://ariadne.ai
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx