Re: Squid 19.2.0 balancer causes restful requests to be lost

Hi Chris,

I forgot a small detail: we are working on a new REST and gRPC API for Ceph.
Joachim


  +49 89 2152527-21

  joachim.kraftmayer@xxxxxxxxx

  www.clyso.com

  Hohenzollernstr. 27, 80801 Munich

Utting | HR: Augsburg | HRB: 25866 | USt. ID-Nr.: DE275430677



On Thu, 7 Nov 2024 at 21:50, Joachim Kraftmayer <joachim.kraftmayer@xxxxxxxxx> wrote:

> Hi Chris,
> we have been working on a standalone API for Ceph for a few months, but it
> is not yet complete.
> Which API calls do you need? Maybe we have already implemented them or can
> prioritize them.
> We plan to release the API repository for Cephalocon 2024.
> Maybe we'll see you there, Joachim
>
>   joachim.kraftmayer@xxxxxxxxx
>
>   www.clyso.com
>
>   Hohenzollernstr. 27, 80801 Munich
>
> Utting | HR: Augsburg | HRB: 25866 | USt. ID-Nr.: DE275430677
>
>
>
> On Thu, 7 Nov 2024 at 11:04, Chris Palmer <chris.palmer@xxxxxxxxx> wrote:
>
>> Hi Ernesto
>>
>> OK, well that puts a whole different spin on the problem. I'd not seen
>> that (it slightly predates my involvement with ceph), and I was just
>> using the plugin that comes with zabbix so hadn't had any cause to look
>> more deeply at the restful api. (Note that this plugin is completely
>> different from the zabbix module supplied by ceph, which I discounted as
>> I knew it was deprecated).
>>
>> It sounds as though the right path would be for the zabbix-supplied ceph
>> plugin to be reworked to use the newer ceph dashboard api, if that is
>> possible. I might take a look at the api and plugin to see what would be
>> involved.
>>
>> Regards, Chris
>>
>>
>>
>> On 06/11/2024 12:48, Ernesto Puerta wrote:
>> > > We use the restful API for monitoring (using the Ceph for Zabbix
>> > > Agent 2 plugin, as Zabbix is the over-arching monitoring platform in
>> > > the data centre)
>> >
>> > Chris, just FYI: the "restful" mgr module was deprecated 4 years ago
>> > [1] and will be removed in v20 (Tentacle). [2] Something similar will
>> > happen with the "zabbix" mgr module. [3]
>> >
>> > [1] https://lists.ceph.io/hyperkitty/list/ceph-users@xxxxxxx/thread/LBKLNXH7UQL7TLFU5G52Y2SYVME4RS6P/
>> > [2] https://github.com/ceph/ceph/pull/57299
>> > [3] https://docs.ceph.com/en/squid/mgr/zabbix/#zabbix-module
>> >
>> > Kind Regards,
>> > Ernesto
>> >
>> >
>> > On Tue, Nov 5, 2024 at 9:19 PM Laura Flores <lflores@xxxxxxxxxx> wrote:
>> >
>> >     Thanks for sending this in, Chris.
>> >
>> >     On Fri, Nov 1, 2024 at 9:47 AM Chris Palmer
>> >     <chris.palmer@xxxxxxxxx> wrote:
>> >
>> >     > Hi Laura
>> >     >
>> >     > Logged as https://tracker.ceph.com/issues/68801. It is pretty much
>> >     > as I thought, except that the requests are not actually being lost
>> >     > while the balancer is running, but are delayed until it has
>> >     > finished (by which time the client has timed out). If there is
>> >     > anything more I can do, just let me know.
>> >     > Regards, Chris
>> >     > On 31/10/2024 15:12, Laura Flores wrote:
>> >     >
>> >     > Hi Chris,
>> >     >
>> >     > As other users have pointed out, we are fixing an issue tracked in
>> >     > https://tracker.ceph.com/issues/68657 that seems related to what
>> >     > you're experiencing. However, can you raise a new tracker
>> >     > describing your problem so we can confirm?
>> >     >
>> >     > Can you please include:
>> >     > 1. Steps to reproduce (including any commands you are performing
>> >     >    to invoke the restful api)
>> >     > 2. MGR logs with `ceph config set mgr.* debug_mgr 20` and
>> >     >    `ceph config set mgr mgr/balancer/log_level debug`
>> >     >
>> >     > Thanks,
>> >     > Laura
>> >     >
>> >     > On Wed, Oct 30, 2024 at 7:24 AM Chris Palmer <chris.palmer@xxxxxxxxx> wrote:
>> >     >
>> >     >
>> >     > I've just upgraded a test cluster from 18.2.4 to 19.2.0. Package
>> >     > install on centos 9 stream. Very smooth upgrade. Only one problem
>> >     > so far...
>> >     >
>> >     > The MGR restful api calls work fine, EXCEPT whenever the balancer
>> >     > kicks in to find any new plans. During the few seconds that the
>> >     > balancer takes to run, all REST calls seem to be completely
>> >     > dropped. The MGR log file normally logs the POST requests, but the
>> >     > ones during these few seconds don't appear at all. This causes our
>> >     > monitoring to keep raising alarms.
>> >     >
>> >     > The cluster is in a completely stable state, HEALTH_OK, very little
>> >     > activity, just the occasional scrubs.
>> >     >
>> >     > We use the restful API for monitoring (using the Ceph for Zabbix
>> >     > Agent 2 plugin, as Zabbix is the over-arching monitoring platform
>> >     > in the data centre). I haven't yet checked the memory leak problems
>> >     > that we (like many) were having, because I have been chasing this
>> >     > new problem.
>> >     >
>> >     > The problem is quite repeatable. To diagnose I use the zabbix_get
>> >     > utility to query every second. Whenever the MGR log file shows the
>> >     > balancer kick in, the REST requests time out (after 3 seconds; I am
>> >     > not sure whether the utility or the MGR is timing them out, but I
>> >     > suspect the utility). They normally complete after a small fraction
>> >     > of a second. With the balancer disabled the REST interface works
>> >     > reliably again.
>> >     >
>> >     > The problem does not occur pre-squid.
>> >     >
>> >     > Anyone any ideas, or shall I raise a bug?
>> >     >
>> >     > Thanks, Chris
>> >     > _______________________________________________
>> >     > ceph-users mailing list -- ceph-users@xxxxxxx
>> >     > To unsubscribe send an email to ceph-users-leave@xxxxxxx
>> >     >
>> >     >
>> >     >
>> >
>> >     --
>> >
>> >     Laura Flores
>> >
>> >     She/Her/Hers
>> >
>> >     Software Engineer, Ceph Storage <https://ceph.io>
>> >
>> >     Chicago, IL
>> >
>> >     lflores@xxxxxxx | lflores@xxxxxxxxxx <lflores@xxxxxxxxxx>
>> >     M: +17087388804
>> >
>>
>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
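
The once-per-second probing described in the original report can be approximated without zabbix_get. The following is only a minimal Python sketch, not the Zabbix plugin's logic: the names `poll` and `http_check` are made up for illustration, the 3-second threshold simply mirrors the timeout mentioned in the report, and any URL passed in is a placeholder for whatever the local mgr endpoint is. Authentication and TLS settings, which a real restful endpoint would probably need, are deliberately left out.

```python
import time
import urllib.request
from typing import Callable, List, Tuple


def http_check(url: str, timeout: float = 3.0) -> int:
    """Fetch `url` once and return the HTTP status code.

    Raises an exception (e.g. on timeout) if the request does not complete.
    The URL is supplied by the caller; nothing here is Ceph-specific.
    """
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.status


def poll(check: Callable[[], object], count: int, interval: float = 1.0,
         slow_after: float = 3.0) -> List[Tuple[float, str]]:
    """Call check() `count` times, roughly once per `interval` seconds.

    Returns one (elapsed_seconds, outcome) pair per call. The outcome is
    "ok", "slow" (completed, but took longer than `slow_after` seconds),
    or "error: <ExceptionName>" if check() raised.
    """
    results = []
    for _ in range(count):
        start = time.monotonic()
        try:
            check()
            elapsed = time.monotonic() - start
            outcome = "slow" if elapsed > slow_after else "ok"
        except Exception as exc:  # timeouts surface here as exceptions
            elapsed = time.monotonic() - start
            outcome = f"error: {type(exc).__name__}"
        results.append((elapsed, outcome))
        # Sleep only for what is left of the interval after the call.
        time.sleep(max(0.0, interval - elapsed))
    return results
```

A run such as `poll(lambda: http_check("https://mgr.example/"), count=60)` (placeholder host) returns a list whose timestamps and outcomes can be lined up against the balancer entries in the MGR log, to see whether delayed or failed calls coincide with balancer runs.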



