Re: Problems with statistics after upgrade to luminous

On Mon, 10 Jul 2017, Gregory Farnum wrote:
> On Mon, Jul 10, 2017 at 12:57 AM Marc Roos <M.Roos@xxxxxxxxxxxxxxxxx> wrote:
> 
>       I need a little help with fixing some errors I am having.
> 
>       After upgrading from Kraken I'm getting incorrect values reported
>       for placement groups etc. At first I thought it was because I was
>       changing the public cluster IP address range and modifying the
>       monmap directly. But even after deleting and re-adding a monitor,
>       this ceph daemon dump is still incorrect.
> 
> 
> 
> 
>       ceph daemon mon.a perf dump cluster
>       {
>           "cluster": {
>               "num_mon": 3,
>               "num_mon_quorum": 3,
>               "num_osd": 6,
>               "num_osd_up": 6,
>               "num_osd_in": 6,
>               "osd_epoch": 3842,
>               "osd_bytes": 0,
>               "osd_bytes_used": 0,
>               "osd_bytes_avail": 0,
>               "num_pool": 0,
>               "num_pg": 0,
>               "num_pg_active_clean": 0,
>               "num_pg_active": 0,
>               "num_pg_peering": 0,
>               "num_object": 0,
>               "num_object_degraded": 0,
>               "num_object_misplaced": 0,
>               "num_object_unfound": 0,
>               "num_bytes": 0,
>               "num_mds_up": 1,
>               "num_mds_in": 1,
>               "num_mds_failed": 0,
>               "mds_epoch": 816
>           }
>       }
> 
> 
> Huh, I didn't know that existed.
> 
> So, yep, most of those values aren't updated any more. From a grep, you can
> still trust:
> num_mon
> num_mon_quorum
> num_osd
> num_osd_up
> num_osd_in
> osd_epoch
> num_mds_up
> num_mds_in
> num_mds_failed
> mds_epoch
> 
> We might be able to keep updating the others when we get reports from the
> manager, but it'd be simpler to just rip them out — I don't think the admin
> socket is really the right place to get cluster summary data like this.
> Sage, any thoughts?
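
For anyone still scraping this endpoint in the meantime, a minimal sketch 
that keeps only the fields Greg lists above (it assumes jq is available and 
that mon.a is the local monitor):

  # keep only the counters that are still maintained, per the list above
  ceph daemon mon.a perf dump cluster | jq '.cluster
      | {num_mon, num_mon_quorum, num_osd, num_osd_up, num_osd_in, osd_epoch,
         num_mds_up, num_mds_in, num_mds_failed, mds_epoch}'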

These were added to fill a gap for operators who are collecting everything 
via collectd or similar.  Getting the same cluster-level data from 
multiple mons is redundant, but it avoids having to code up a separate 
collector that polls the CLI or something.
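
As a sketch of what such a collector amounts to today (the socket path glob 
and the 30-second interval are assumptions, not anything collectd actually 
ships):

  # poll every local mon's admin socket and emit the cluster counters
  while sleep 30; do
      for sock in /var/run/ceph/ceph-mon.*.asok; do
          ceph --admin-daemon "$sock" perf dump cluster
      done
  done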

I suspect once we're funneling everything through a mgr module this 
problem will go away and we can remove this.  Until then, these are easy 
to fix by populating from PGMapDigest... my vote is we do that!

sage
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
