Re: Squid 19.2.0 balancer causes restful requests to be lost

Hi Laura

Logged as https://tracker.ceph.com/issues/68801

It is pretty much as I thought, except that the requests are not actually being lost while the balancer is running, but are delayed until it has finished (by which time the client has timed out).

Anything more I can do, just let me know.

Regards, Chris

On 31/10/2024 15:12, Laura Flores wrote:
Hi Chris,

As other users have pointed out, we are fixing an issue tracked in
https://tracker.ceph.com/issues/68657 that seems related to what you're
experiencing. However, can you raise a new tracker describing your problem
so we can confirm?

Can you please include:
1. Steps to reproduce (including any commands you are performing to invoke
the restful api)
2. MGR logs with `ceph config set mgr.* debug_mgr 20` and `ceph config set
mgr mgr/balancer/log_level debug`

Thanks,
Laura

On Wed, Oct 30, 2024 at 7:24 AM Chris Palmer <chris.palmer@xxxxxxxxx> wrote:

I've just upgraded a test cluster from 18.2.4 to 19.2.0.  Package
install on CentOS Stream 9. Very smooth upgrade. Only one problem so far...

The MGR restful api calls work fine. EXCEPT whenever the balancer kicks
in to find any new plans. During the few seconds that the balancer takes
to run, all REST calls seem to be completely dropped. The MGR log file
normally logs the POST requests, but the ones during these few seconds
don't appear at all. This causes our monitoring to keep raising alarms.

The cluster is in a completely stable state, HEALTH_OK, very little
activity, just the occasional scrubs.

We use the restful API for monitoring (using the Ceph for Zabbix Agent 2
plugin, as Zabbix is the over-arching monitoring platform in the data
centre). I haven't yet checked the memory leak problems that we (like
many) were having, because I have been chasing this new problem.

The problem is quite repeatable. To diagnose I use the zabbix_get
utility to query every second. Whenever the MGR log file shows the
balancer kick in, the REST requests time out (after 3 seconds - not sure
whether the utility or the MGR is timing them out - I suspect the
utility). They normally complete after a small fraction of a second.
With the balancer disabled the REST interface works reliably again.
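
For anyone wanting to reproduce this without Zabbix, the once-a-second polling described above can be approximated with a small script. This is only a sketch under stated assumptions: the names `poll`, `probe_once` and `http_probe` are made up for illustration, the URL is a placeholder, and authentication and TLS settings for the MGR restful module are left out and would need adding for a real cluster. The 1 second interval and 3 second timeout mirror the figures mentioned above.

```python
import time
import urllib.request


def probe_once(probe, timeout_s):
    """Run probe(timeout_s) once; return (ok, elapsed_seconds, detail)."""
    start = time.monotonic()
    try:
        probe(timeout_s)
        return True, time.monotonic() - start, "ok"
    except Exception as exc:  # timeouts, connection errors, HTTP errors
        return False, time.monotonic() - start, type(exc).__name__


def poll(probe, count, interval_s=1.0, timeout_s=3.0, sleep=time.sleep):
    """Call the probe `count` times, roughly once per interval, logging each result."""
    results = []
    for _ in range(count):
        ok, elapsed, detail = probe_once(probe, timeout_s)
        results.append((ok, elapsed, detail))
        stamp = time.strftime("%H:%M:%S")
        print(f"{stamp} {'OK' if ok else 'FAIL'} {elapsed:.3f}s {detail}")
        # Sleep for whatever is left of the interval (never a negative time).
        sleep(max(0.0, interval_s - elapsed))
    return results


def http_probe(url, data=None):
    """Build a probe that sends one HTTP request to `url` (placeholder endpoint).

    Assumption: no credentials or custom TLS context are supplied here.
    """
    def _probe(timeout_s):
        req = urllib.request.Request(url, data=data)
        with urllib.request.urlopen(req, timeout=timeout_s) as resp:
            resp.read()
    return _probe


# Example (hypothetical host, not run here):
# poll(http_probe("https://mgr.example.invalid:8003/"), count=60)
```

Comparing the timestamps of the FAIL lines with the balancer entries in the MGR log should show whether the failures line up with balancer runs. Raising `timeout_s` well above 3 seconds would also show whether the requests eventually complete rather than being dropped.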

The problem does not occur pre-squid.

Anyone any ideas, or shall I raise a bug?

Thanks, Chris
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx