Re: Squid 19.2.0 balancer causes restful requests to be lost

Thanks for sending this in, Chris.

On Fri, Nov 1, 2024 at 9:47 AM Chris Palmer <chris.palmer@xxxxxxxxx> wrote:

> Hi Laura
>
> Logged as https://tracker.ceph.com/issues/68801. It is pretty much as I
> thought, except that the requests are not actually being lost while the
> balancer is running, but are delayed until it has finished (by which time
> the client has timed out). If there is anything more I can do, just let me know.
> Regards, Chris
> On 31/10/2024 15:12, Laura Flores wrote:
>
> Hi Chris,
>
> As other users have pointed out, we are fixing an issue tracked in
> https://tracker.ceph.com/issues/68657 that seems related to what you're
> experiencing. However, can you raise a new tracker describing your problem
> so we can confirm?
>
> Can you please include:
> 1. Steps to reproduce (including any commands you are performing to invoke
> the restful api)
> 2. MGR logs with `ceph config set mgr.* debug_mgr 20` and `ceph config set
> mgr mgr/balancer/log_level debug`
>
> Thanks,
> Laura
>
> On Wed, Oct 30, 2024 at 7:24 AM Chris Palmer <chris.palmer@xxxxxxxxx> wrote:
>
>
> I've just upgraded a test cluster from 18.2.4 to 19.2.0. Package
> install on CentOS 9 Stream. Very smooth upgrade. Only one problem so far...
>
> The MGR restful api calls work fine. EXCEPT whenever the balancer kicks
> in to find any new plans. During the few seconds that the balancer takes
> to run, all REST calls seem to be completely dropped. The MGR log file
> normally logs the POST requests, but the ones during these few seconds
> don't appear at all. This causes our monitoring to keep raising alarms.
>
> The cluster is in a completely stable state, HEALTH_OK, very little
> activity, just the occasional scrubs.
>
> We use the restful API for monitoring (using the Ceph for Zabbix Agent 2
> plugin, as Zabbix is the over-arching monitoring platform in the data
> centre). I haven't yet checked the memory leak problems that we (like
> many) were having, because I have been chasing this new problem.
>
> The problem is quite repeatable. To diagnose it I use the zabbix_get
> utility to query every second. Whenever the MGR log file shows the
> balancer kick in, the REST requests time out (after 3 seconds - not sure
> whether the utility or the MGR is timing them out - I suspect the
> utility). They normally complete in a small fraction of a second.
> With the balancer disabled the REST interface works reliably again.
>
> The problem does not occur pre-squid.
>
> Anyone any ideas, or shall I raise a bug?
>
> Thanks, Chris
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
>
>
>
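
For anyone who wants to reproduce the once-per-second polling described
above without Zabbix, here is a minimal sketch in Python. It is only a
stand-in for zabbix_get, not what was actually run: the URL, user name and
API key are hypothetical placeholders, and disabling certificate checks
assumes a test cluster with a self-signed MGR certificate. It issues one
request per second with a 3 second timeout and records whether each call
succeeded and how long it took, so stalls can be lined up against the
balancer entries in the MGR log.

```python
import base64
import ssl
import time
import urllib.request


def make_fetch(url, user, key, timeout=3.0):
    """Return a callable that performs one authenticated GET against url."""
    ctx = ssl.create_default_context()
    # Assumption: test cluster with a self-signed certificate, so skip verification.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    token = base64.b64encode(f"{user}:{key}".encode()).decode()
    req = urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})

    def fetch():
        with urllib.request.urlopen(req, timeout=timeout, context=ctx) as resp:
            resp.read()

    return fetch


def probe(fetch, count, interval=1.0, clock=time.monotonic, sleep=time.sleep):
    """Call fetch() count times, about once per interval; return (ok, seconds) pairs."""
    results = []
    for _ in range(count):
        start = clock()
        try:
            fetch()
            ok = True
        except OSError:
            # urllib.error.URLError and socket timeouts are OSError subclasses.
            ok = False
        elapsed = clock() - start
        results.append((ok, elapsed))
        # Sleep only for whatever is left of the interval after the request.
        sleep(max(0.0, interval - elapsed))
    return results
```

A run might look like `probe(make_fetch("https://mgr.example:8003/", "monitor", "API_KEY"), count=60)`,
where the host, port, path and credentials all need replacing with the
values for your own restful module setup. Entries with `ok` set to False
and an elapsed time close to 3 seconds correspond to the timeouts
described in the report.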

-- 

Laura Flores

She/Her/Hers

Software Engineer, Ceph Storage <https://ceph.io>

Chicago, IL

lflores@xxxxxxx | lflores@xxxxxxxxxx
M: +17087388804