Re: PSA: sqlite3 databases now available for ceph-mgr modules

On Fri, Jun 18, 2021 at 11:53 PM Patrick Donnelly <pdonnell@xxxxxxxxxx> wrote:
>
> Hi Kefu,
>
> On Thu, Jun 17, 2021 at 9:24 PM kefu chai <tchaikov@xxxxxxxxx> wrote:
> >
> > On Wed, Jun 16, 2021 at 10:23 PM Patrick Donnelly <pdonnell@xxxxxxxxxx> wrote:
> > >
> > > Introduced by [1] for the Quincy release. This builds on work in [2]
> > > to add RADOS-backed sqlite3 support to Ceph (available in Pacific).
> > >
> > > The MgrModule API for accessing your module's database is introduced
> > > in [3]. An example of a module ("devicehealth") using the API can be
> > > seen in [4].
> > >
> > > Please let me know if you have any questions or feedback.
> >
> >
> > Hi Patrick,
> >
> > my concern is that, without careful planning to separate the pool
> > storing the health data from the pools being monitored, we could
> > interfere with the monitored system by mutating its state.
> >
> > for instance, if a cluster is experiencing large-scale slow ops and
> > is pumping out lots of warning messages and/or structured
> > performance-related metrics, some mgr module might want to collect
> > this information from the health monitoring subsystem and persist it
> > into the sqlite3 database. but that database is in turn backed by the
> > same cluster. without careful planning, the objects stored in the
> > .mgr pool could map to the same set of OSDs and monitors that are
> > suffering from the performance issue, and in the worst case this
> > could even worsen the situation. but allocating dedicated OSDs and
> > creating a CRUSH rule that picks them just for the .mgr pool might be
> > difficult, or overkill from a maintainability point of view (a rough
> > sketch of the commands follows below).
> >
> > we actually had the same issue when we added cluster log messages in
> > the OSD for recording slow requests. the large volume of clog traffic
> > puts extra burden on the monitors, and if the slow requests are
> > caused by a monitor, these clog messages in turn slow the monitors
> > down even further.
> >
> > shall we switch to a (local) fallback sqlite backend when we identify
> > a performance issue, and restore / backfill the records once the
> > issue is resolved?
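
for the record, pinning the .mgr pool to dedicated OSDs would look
roughly like the following. the device class name here is made up for
the example, and it does carry the operational burden mentioned above:

    # tag a few spare OSDs with a dedicated (hypothetical) device class
    ceph osd crush set-device-class mgr-only osd.10 osd.11 osd.12
    # create a replicated CRUSH rule that only picks OSDs of that class
    ceph osd crush rule create-replicated mgr-only-rule default host mgr-only
    # point the .mgr pool at the new rule
    ceph osd pool set .mgr crush_rule mgr-only-rule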
>
> Thanks for bringing this up. I think it would be reasonable to decide
> this depending on what the mgr module is doing. For example, I think
> devicehealth and snap_schedule are innocuous enough that we don't need
> to give special consideration to the system potentially being under
> load. Also, these modules' mutations of the databases do not depend on
> the cluster state, healthy or degraded. OTOH, a module that is

okay, that's a relief. just a note for future developers: we should not
use the sqlite backend to store alerts created while the cluster itself
is unhealthy.
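
for future readers, here is a minimal sketch of what a module using the
database API can look like, modeled loosely on the devicehealth code in
[4]; the class and table names are made up, so check [3] for the exact
interface:

    from mgr_module import MgrModule

    class Example(MgrModule):
        # SQL run by the mgr when the module's database is first created
        SCHEMA = """
        CREATE TABLE IF NOT EXISTS Alert (
            time INTEGER PRIMARY KEY,
            message TEXT
        );
        """

        def record(self, message: str) -> None:
            # self.db is a sqlite3 connection whose data lives as
            # objects in the .mgr pool (via the libcephsqlite work in [2])
            with self.db:
                self.db.execute(
                    "INSERT INTO Alert (time, message) "
                    "VALUES (strftime('%s','now'), ?)", (message,))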

> collecting large streams of data into the database might first ingest
> that data into a local in-memory database and only back up [1] that
> in-memory database to RADOS when the cluster is healthy. If the
> database is very large then a backup would not be desirable as the
> in-memory database would be too large. In that case I would suggest
> streaming batch updates in large transactions.

thanks. glad that we have a plan B in that case.
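
for reference, the staging + backup idea in plain python sqlite3 would
be along these lines (rados_db stands in for the module's RADOS-backed
self.db connection):

    import sqlite3

    # local staging database; writes here never touch the cluster
    mem = sqlite3.connect(':memory:')
    mem.execute('CREATE TABLE metrics (time INTEGER, osd TEXT, value REAL)')

    # ... ingest rows into mem while the cluster is degraded ...

    def flush(rados_db: sqlite3.Connection) -> None:
        # copy the staging database wholesale into the RADOS-backed
        # database using the sqlite online backup API [1]
        mem.backup(rados_db)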

>
> What do you think?
>
> [1] https://www.sqlite.org/backup.html
>
> --
> Patrick Donnelly, Ph.D.
> He / Him / His
> Principal Software Engineer
> Red Hat Sunnyvale, CA
> GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
>


-- 
Regards
Kefu Chai
_______________________________________________
Dev mailing list -- dev@xxxxxxx
To unsubscribe send an email to dev-leave@xxxxxxx


