Re: Monitoring ceph and prometheus

Lars Marowsky-Bree <lmb@xxxxxxxx> · Thu, 18 May 2017 10:37:16 +0200

On 2017-05-15T13:33:29, John Spray <jspray@xxxxxxxxxx> wrote:

> At the risk of being a bit picky, it's only redundant if prometheus is
> the only thing consuming them.  If the user is also using some mgr
> modules (including things like handy CLI views) that consume the
> stats, it's not redundant at all.  I'd like to keep these stats around
> in the mgr because we're not quite sure yet what kinds of modules
> we'll end up with.

Fair enough. The point that consumers may wish to gather information at
different frequencies still stands, though: a ceph-mgr module may do it
on demand for certain tasks, event-driven, or periodically, while
Prometheus (or another trending tool) would want to poll certain
counters at various frequencies, etc.

(e.g., maybe the OSD ones every 10s, SMART every 3h, whatever)

Aligning these would be annoying, and it seems to me that it makes more
sense to allow them to poll independently from the same interfaces.
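
To illustrate what I mean by polling independently, here is a rough
sketch, not actual ceph-mgr or Prometheus code. The source names, the
intervals, and the poll_schedule() helper are all made up for
illustration; it only shows that each consumer can keep its own timer
against the same interface without any shared tick.

```python
import heapq
from typing import Dict, List, Tuple


def poll_schedule(intervals: Dict[str, int], horizon: int) -> List[Tuple[int, str]]:
    """Return (time, source) pairs for every poll that fires in [0, horizon).

    `intervals` maps a hypothetical source name to its own poll period in
    seconds. Each source is rescheduled on its own; nothing aligns them.
    """
    heap = [(0, name) for name in sorted(intervals)]
    heapq.heapify(heap)
    fired = []
    while heap:
        t, name = heapq.heappop(heap)
        if t >= horizon:
            # Past the window we care about: drop this source's timer.
            continue
        fired.append((t, name))
        heapq.heappush(heap, (t + intervals[name], name))
    return fired


# Assumed example periods: OSD counters every 10s, SMART every 3h.
schedule = poll_schedule({"osd": 10, "smart": 3 * 3600}, horizon=30)
```

Within the first 30 seconds the OSD counters are polled three times and
SMART once, and neither schedule needs to know about the other.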

> Sage's recent change to add the importance thresholds to perf counters
> could be interesting here: we might end up sending everything that's
> "reasonably important" and higher to the mgr for exposing in CLI tools
> etc (I'm thinking of things like the OSD throughput, the MDS number of
> each op per second, etc), while perhaps the really obscure stuff would
> only get collected (into prometheus?) if someone actively chose that.

That's actually somewhat related to how SMART classifies attributes:
value, threshold, type (old-age, pre-fail; we could add a "perf" one).
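
As a sketch of that shape, assuming a record with exactly those three
fields: the Attribute and Kind names are hypothetical, and the "perf"
kind does not exist in SMART. The comparison follows the SMART
convention that a normalized value at or below its threshold counts as
failed; a perf counter might well want the opposite direction.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    OLD_AGE = "old-age"
    PRE_FAIL = "pre-fail"
    PERF = "perf"  # hypothetical extra type, not part of SMART


@dataclass(frozen=True)
class Attribute:
    name: str
    value: int       # normalized value, higher is healthier
    threshold: int   # failure threshold for this attribute
    kind: Kind

    def tripped(self) -> bool:
        # SMART convention: failed when value <= threshold.
        return self.value <= self.threshold


realloc = Attribute("Reallocated_Sector_Ct", value=5, threshold=10, kind=Kind.PRE_FAIL)
latency = Attribute("osd_op_latency", value=100, threshold=10, kind=Kind.PERF)
```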

I take the point - there's also a need for an event-driven channel that
needs to be push-based by default. (From simple operation completion
notification to "OMFG the disk caught fire.")

I could see those going to ceph-mgr for handling/relaying.

> > So, perhaps exposing this - the dynamic service/target discovery via
> > ceph-mgr to Prometheus, and then having Prometheus pull directly - is a
> > synthesis of both positions?
> It would certainly be ++good to build in the service discovery so that
> the user only needs to point prometheus at one place to discover
> everything.  Anything that avoids the need for extra external tools to
> set things up makes me happy.

Yes, I think that'd be great to have. And at least in my head the idea
of where information goes becomes clearer.
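
One way this could look, purely as a sketch: ceph-mgr renders its
service map into the JSON target-group format that Prometheus'
file-based service discovery reads (a list of objects with "targets"
and "labels"). The input dict, the host:port values, the
ceph_daemon_type label, and to_file_sd() itself are assumptions for
illustration, not an existing ceph-mgr interface.

```python
import json
from typing import Dict, List


def to_file_sd(services: Dict[str, List[str]]) -> str:
    """Render {daemon type: ["host:port", ...]} as file_sd target groups."""
    groups = [
        {"targets": sorted(addrs), "labels": {"ceph_daemon_type": kind}}
        for kind, addrs in sorted(services.items())
        if addrs  # skip daemon types with nothing to scrape
    ]
    return json.dumps(groups, indent=2)


# Hypothetical service map; hosts and ports are placeholders.
rendered = to_file_sd({
    "osd": ["node2:9100", "node1:9100"],
    "mds": [],
})
```

Prometheus would then only need to be pointed at that one output, and
would still scrape each listed target directly.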

Notifications/events go to and through ceph-mgr. ceph-mgr keeps track of
Ceph services. Trending/metrics should IMNSHO be polled directly as
needed.

Regards,
    Lars

-- 
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
"Experience is the name everyone gives to their mistakes." -- Oscar Wilde
