On Thu, 11 Oct 2018, Ugis wrote:
> Yes, supplementing the ceph df GLOBAL section with a view based on the
> CRUSH root(s) and the device classes - that would be useful to get better
> insight into what is going on in the cluster.

I've added a card for this:

	https://trello.com/c/4EKu1Aie/411-mon-make-df-show-available-space-by-crush-root-and-device-class

Thanks!
sage

>
> Uģis
>
> On Thu, 11 Oct 2018 at 01:15, Kyle Bader <kyle.bader@xxxxxxxxx> wrote:
> >
> > +1 for output that breaks out device classes
> >
> > On Wed, 10 Oct 2018 at 11:27, Sage Weil <sage@xxxxxxxxxxxx> wrote:
> >>
> >> On Wed, 10 Oct 2018, Sage Weil wrote:
> >> > On Wed, 10 Oct 2018, Ugis wrote:
> >> > > Ok, that was useful.
> >> > > I guess the explanation then could be that we have device classes for HDD
> >> > > and SSD https://ceph.com/community/new-luminous-crush-device-classes/
> >> >
> >> > Bingo. :)
> >> >
> >> > > which adds to GLOBAL AVAIL.
> >> > > OSDs are balanced more or less evenly, USE% = 54-70%.
> >> > >
> >> > > So the more reliable number is the per-pool AVAIL, as the pool cannot
> >> > > expand once that runs out.
> >> >
> >> > Right.  If you can point to places in the documentation or in how the
> >> > output is presented that could be improved to avoid this confusion, let
> >> > us know!
> >>
> >> Hmm, I wonder if that GLOBAL section could be supplemented with a view
> >> based on the CRUSH root(s) (usually there is just one, 'default') and the
> >> device classes, breaking down usage that way.
> >>
> >> sage
> >>
> >>
> >> >
> >> > sage
> >> >
> >> > >
> >> > > Ugis
> >> > >
> >> > >
> >> > > On Wed, 10 Oct 2018 at 17:49, Sage Weil <sage@xxxxxxxxxxxx> wrote:
> >> > > >
> >> > > > On Wed, 10 Oct 2018, Ugis wrote:
> >> > > > > Hi,
> >> > > > >
> >> > > > > ceph version 13.2.2 (02899bfda814146b021136e9d8e80eba494e1126) mimic (stable)
> >> > > > >
> >> > > > > I cannot understand why ceph shows only 17 TiB "MAX AVAIL" for pools
> >> > > > > while GLOBAL AVAIL = 105 TiB.
> >> > > > > What I expect is that for pools with replica count 3, MAX AVAIL should
> >> > > > > be roughly GLOBAL AVAIL / 3 ~ 35 TiB.
> >> > > > >
> >> > > > > In the excerpt below, pool xxxxxxxxxxxx has replica count 2; the rest have 3.
> >> > > >
> >> > > > The 2x vs 3x is the reason you see 25 TiB vs 17 TiB.  The reason it is so
> >> > > > much lower than the global avail is harder to see from the info below,
> >> > > > though.  The per-pool avail is calculated by looking at the OSDs touched
> >> > > > by that pool and calculating which will fill up first (as that practically
> >> > > > limits how much you can store), while the global avail just adds up the
> >> > > > free space everywhere.  So maybe your OSDs are imbalanced, or maybe you
> >> > > > have rules for those pools that only point to a subset of the OSDs in the
> >> > > > system?
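
For reference, both of those possibilities can be checked with the standard
CLI.  A minimal sketch (the pool name is taken from the excerpt below; the
rule name passed to "crush rule dump" is a placeholder for whatever the
first command prints):

    # which CRUSH rule the pool uses
    ceph osd pool get ssssssss crush_rule

    # dump that rule; a class-restricted rule "take"s a shadow root such
    # as default~ssd instead of plain default
    # (replace "replicated_ssd" with the rule name printed above)
    ceph osd crush rule dump replicated_ssd

    # per-OSD usage: the most-full OSD the rule can reach is what caps
    # the pool's MAX AVAIL, not the cluster-wide free space
    ceph osd df tree

If the rule takes a class-specific shadow root such as default~ssd, only
the OSDs of that class count toward the pool's MAX AVAIL, which matches the
device-class explanation earlier in the thread.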
> >> > > > >
> >> > > > sage
> >> > > >
> >> > > > >
> >> > > > > # ceph df detail
> >> > > > > GLOBAL:
> >> > > > >     SIZE        AVAIL       RAW USED     %RAW USED     OBJECTS
> >> > > > >     250 TiB     105 TiB      145 TiB         57.98     12.71 M
> >> > > > > POOLS:
> >> > > > >     NAME                ID     QUOTA OBJECTS     QUOTA BYTES     USED        %USED     MAX AVAIL     OBJECTS     DIRTY       READ        WRITE       RAW USED
> >> > > > >     xxxxxxxxxxxx         9     N/A               N/A             1.0 GiB         0        25 TiB         298         298      85 KiB     9.9 KiB      2.0 GiB
> >> > > > >     ssssssss            17     N/A               N/A              10 TiB     37.72        17 TiB     2621445      2.62 M     471 MiB     367 MiB       30 TiB
> >> > > > >     eeeeeeeee           47     N/A               N/A               532 B         0        17 TiB           1           1         1 B         9 B      1.6 KiB
> >> > > > >     yyyyyyyyyyyyyyyyy   48     N/A               N/A             2.1 TiB     11.36        17 TiB      554933     554.9 k     194 MiB      39 MiB      6.4 TiB
> >> > > > >     ......
> >> > > > >
> >> > > > > Any ideas where the space is hidden, or am I interpreting GLOBAL
> >> > > > > AVAIL wrongly?
> >> > > > >
> >> > > > > Best regards,
> >> > > > > Ugis
> >> > > > >
> >> > > > >
> >> > >
> >> > >
> >
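
Until ceph df itself grows the per-root / per-class breakdown tracked in
the Trello card at the top of this thread, a rough equivalent can be pieced
together from existing commands (a sketch only, assuming Luminous-style
device classes as used in this cluster):

    # device classes defined in the CRUSH map
    ceph osd crush class ls

    # the per-class shadow roots (default~hdd, default~ssd, ...) and
    # their weights
    ceph osd crush tree --show-shadow

    # OSD ids belonging to one class; cross-reference these with the
    # SIZE/AVAIL columns of "ceph osd df" to add up capacity per class
    ceph osd crush class ls-osd ssd
    ceph osd df

Summing SIZE and AVAIL over the OSDs listed for each class gives the
per-class totals that the GLOBAL section currently lumps together.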