Re: Squid 19.2.0 balancer causes restful requests to be lost

Ernesto Puerta <epuertat@xxxxxxxxxx> · Wed, 6 Nov 2024 13:48:14 +0100

> We use the restful API for monitoring (using the Ceph for Zabbix Agent 2
> plugin, as Zabbix is the over-arching monitoring platform in the data
> centre)

Chris, just FYI: the "restful" mgr module was deprecated 4 years ago [1]
and will be removed in v20 (Tentacle). [2] Something similar will happen
with the "zabbix" mgr module. [3]

[1]
https://lists.ceph.io/hyperkitty/list/ceph-users@xxxxxxx/thread/LBKLNXH7UQL7TLFU5G52Y2SYVME4RS6P/
[2] https://github.com/ceph/ceph/pull/57299
[3] https://docs.ceph.com/en/squid/mgr/zabbix/#zabbix-module

Kind Regards,
Ernesto
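
For anyone trying to reproduce the behaviour described in the quoted report below (requests polled once per second that time out after 3 seconds while the balancer runs), here is a minimal Python sketch of a latency probe. It is not the zabbix_get utility Chris used and it is not part of Ceph. The names `timed_call`, `http_get` and `poll` are hypothetical, and the URL, authentication and TLS settings for your own mgr are assumptions you would need to fill in.

```python
import time
import urllib.request


def timed_call(request, timeout):
    """Run request(timeout); return (ok, elapsed_seconds)."""
    start = time.monotonic()
    try:
        request(timeout)
        ok = True
    except OSError:
        # Covers socket timeouts, connection errors and urllib's URLError
        # and HTTPError, all of which derive from OSError.
        ok = False
    return ok, time.monotonic() - start


def http_get(url, timeout):
    """Fetch url and discard the body. The URL is a placeholder (assumption)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        resp.read()


def poll(request, count, interval=1.0, timeout=3.0, sleep=time.sleep):
    """Call request `count` times, pausing `interval` seconds between calls.

    The defaults mirror the figures in the report: one query per second,
    with a 3 second client-side timeout.
    """
    results = []
    for _ in range(count):
        ok, elapsed = timed_call(request, timeout)
        results.append((ok, elapsed))
        status = "ok" if ok else "FAILED"
        print(f"{time.strftime('%H:%M:%S')} {status} {elapsed:.3f}s")
        sleep(interval)
    return results
```

A call such as `poll(lambda t: http_get("https://mgr.example.invalid/", t), count=60)` (placeholder host) prints one timestamped line per request, which can then be lined up against the balancer entries in the MGR log.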

On Tue, Nov 5, 2024 at 9:19 PM Laura Flores <lflores@xxxxxxxxxx> wrote:

> Thanks for sending this in, Chris.
>
> On Fri, Nov 1, 2024 at 9:47 AM Chris Palmer <chris.palmer@xxxxxxxxx>
> wrote:
>
> > Hi Laura
> >
> > Logged as https://tracker.ceph.com/issues/68801. It is pretty much as I
> > thought, except that the requests are not actually being lost while the
> > balancer is running, but are delayed until it has finished (by which time
> > the client has timed out). If there is anything more I can do, just let
> > me know.
> > Regards, Chris
> > On 31/10/2024 15:12, Laura Flores wrote:
> >
> > Hi Chris,
> >
> > As other users have pointed out, we are fixing an issue tracked in
> > https://tracker.ceph.com/issues/68657 that seems related to what you're
> > experiencing. However, can you raise a new tracker describing your problem
> > so we can confirm?
> >
> > Can you please include:
> > 1. Steps to reproduce (including any commands you are performing to invoke
> > the restful api)
> > 2. MGR logs with `ceph config set mgr.* debug_mgr 20` and
> > `ceph config set mgr mgr/balancer/log_level debug`
> >
> > Thanks,
> > Laura
> >
> > On Wed, Oct 30, 2024 at 7:24 AM Chris Palmer <chris.palmer@xxxxxxxxx> wrote:
> >
> >
> > I've just upgraded a test cluster from 18.2.4 to 19.2.0. Package
> > install on CentOS Stream 9. Very smooth upgrade. Only one problem so far...
> >
> > The MGR restful api calls work fine. EXCEPT whenever the balancer kicks
> > in to find any new plans. During the few seconds that the balancer takes
> > to run, all REST calls seem to be completely dropped. The MGR log file
> > normally logs the POST requests, but the ones during these few seconds
> > don't appear at all. This causes our monitoring to keep raising alarms.
> >
> > The cluster is in a completely stable state, HEALTH_OK, very little
> > activity, just the occasional scrubs.
> >
> > We use the restful API for monitoring (using the Ceph for Zabbix Agent 2
> > plugin, as Zabbix is the over-arching monitoring platform in the data
> > centre). I haven't yet checked the memory leak problems that we (like
> > many) were having, because I have been chasing this new problem.
> >
> > The problem is quite repeatable. To diagnose I use the zabbix_get
> > utility to query every second. Whenever the MGR log file shows the
> > balancer kick in the REST requests time out (after 3 seconds - not sure
> > whether the utility or the MGR is timing them out - I suspect the
> > utility). They normally complete after a small fraction of a second.
> > With the balancer disabled the REST interface works reliably again.
> >
> > The problem does not occur pre-squid.
> >
> > Anyone any ideas, or shall I raise a bug?
> >
> > Thanks, Chris
> > _______________________________________________
> > ceph-users mailing list -- ceph-users@xxxxxxx
> > To unsubscribe send an email to ceph-users-leave@xxxxxxx
> >
> >
> >
>
> --
>
> Laura Flores
>
> She/Her/Hers
>
> Software Engineer, Ceph Storage <https://ceph.io>
>
> Chicago, IL
>
> lflores@xxxxxxx | lflores@xxxxxxxxxx <lflores@xxxxxxxxxx>
> M: +17087388804