On Thu, 11 Oct 2018, Ugis wrote:
> Yes, supplementing the ceph df GLOBAL section with a view based on the
> CRUSH root(s) and the device classes - that would be useful to get better
> insight into what is going on in the cluster.

I've added a card for this:

	https://trello.com/c/4EKu1Aie/411-mon-make-df-show-available-space-by-crush-root-and-device-class

Thanks!
sage

>
> Uģis
>
> On Thu, 11 Oct 2018 at 01:15, Kyle Bader <kyle.bader@xxxxxxxxx> wrote:
> >
> > +1 for output that breaks out device classes
> >
> > On Wed, 10 Oct 2018 at 11:27, Sage Weil <sage@xxxxxxxxxxxx> wrote:
> >>
> >> On Wed, 10 Oct 2018, Sage Weil wrote:
> >> > On Wed, 10 Oct 2018, Ugis wrote:
> >> > > Ok, that was useful.
> >> > > I guess the explanation then could be that we have device classes for HDD
> >> > > and SSD https://ceph.com/community/new-luminous-crush-device-classes/
> >> >
> >> > Bingo. :)
> >> >
> >> > > which adds to GLOBAL AVAIL.
> >> > > OSDs are balanced more or less evenly, USE% = 54-70%.
> >> > >
> >> > > So the more reliable number is the per-pool AVAIL, as the pool cannot
> >> > > expand once that runs out.
> >> >
> >> > Right.  If you can point to places in the documentation or in how the
> >> > output is presented that could be improved to avoid this confusion, let
> >> > us know!
> >>
> >> Hmm, I wonder if that GLOBAL section could be supplemented with a view
> >> based on the CRUSH root(s) (usually there is just one, 'default') and the
> >> device classes, breaking down usage that way.
> >>
> >> sage
> >>
> >>
> >> >
> >> > sage
> >> >
> >> > >
> >> > > Ugis
> >> > >
> >> > >
> >> > > On Wed, 10 Oct 2018 at 17:49, Sage Weil <sage@xxxxxxxxxxxx> wrote:
> >> > > >
> >> > > > On Wed, 10 Oct 2018, Ugis wrote:
> >> > > > > Hi,
> >> > > > >
> >> > > > > ceph version 13.2.2 (02899bfda814146b021136e9d8e80eba494e1126) mimic (stable)
> >> > > > >
> >> > > > > I cannot understand why ceph shows only 17 TiB "MAX AVAIL" for pools
> >> > > > > while GLOBAL AVAIL = 105 TiB.
> >> > > > > What I expect is that for pools with replica count 3, MAX AVAIL should
> >> > > > > be roughly GLOBAL AVAIL / 3 ~ 35 TiB.
> >> > > > >
> >> > > > > In the excerpt below, pool xxxxxxxxxxxx has replica count 2; the rest have 3.
> >> > > >
> >> > > > The 2x vs 3x is the reason you see 25 TiB vs 17 TiB.  The reason it is so
> >> > > > much lower than the global avail is harder to see from the info below,
> >> > > > though.  The per-pool avail is calculated by looking at the OSDs touched
> >> > > > by that pool and calculating which will fill up first (as that practically
> >> > > > limits how much you can store), while the global avail just adds up the
> >> > > > free space everywhere.  So maybe your OSDs are imbalanced, or maybe you
> >> > > > have rules for those pools that only point to a subset of the OSDs in the
> >> > > > system?
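
For reference, both of those possibilities can be checked with the standard
CLI.  A minimal sketch (the pool name is taken from the excerpt below; the
rule name passed to "crush rule dump" is a placeholder for whatever the
first command prints):

    # which CRUSH rule the pool uses
    ceph osd pool get ssssssss crush_rule

    # dump that rule; a class-restricted rule "take"s a shadow root such
    # as default~ssd instead of plain default
    # (replace "replicated_ssd" with the rule name printed above)
    ceph osd crush rule dump replicated_ssd

    # per-OSD usage: the most-full OSD the rule can reach is what caps
    # the pool's MAX AVAIL, not the cluster-wide free space
    ceph osd df tree

If the rule takes a class-specific shadow root such as default~ssd, only
the OSDs of that class count toward the pool's MAX AVAIL, which matches the
device-class explanation earlier in the thread.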
> >> > > > >
> >> > > > sage
> >> > > >
> >> > > > >
> >> > > > > # ceph df detail
> >> > > > > GLOBAL:
> >> > > > >     SIZE        AVAIL       RAW USED     %RAW USED     OBJECTS
> >> > > > >     250 TiB     105 TiB      145 TiB         57.98     12.71 M
> >> > > > > POOLS:
> >> > > > >     NAME                ID     QUOTA OBJECTS     QUOTA BYTES     USED        %USED     MAX AVAIL     OBJECTS     DIRTY       READ        WRITE       RAW USED
> >> > > > >     xxxxxxxxxxxx         9     N/A               N/A             1.0 GiB         0        25 TiB         298         298      85 KiB     9.9 KiB      2.0 GiB
> >> > > > >     ssssssss            17     N/A               N/A              10 TiB     37.72        17 TiB     2621445      2.62 M     471 MiB     367 MiB       30 TiB
> >> > > > >     eeeeeeeee           47     N/A               N/A               532 B         0        17 TiB           1           1         1 B         9 B      1.6 KiB
> >> > > > >     yyyyyyyyyyyyyyyyy   48     N/A               N/A             2.1 TiB     11.36        17 TiB      554933     554.9 k     194 MiB      39 MiB      6.4 TiB
> >> > > > >     ......
> >> > > > >
> >> > > > > Any ideas where the space is hidden, or am I interpreting GLOBAL
> >> > > > > AVAIL wrongly?
> >> > > > >
> >> > > > > Best regards,
> >> > > > > Ugis
> >> > > > >
> >> > > > >
> >> > >
> >> > >
> >
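
Until ceph df itself grows the per-root / per-class breakdown tracked in
the Trello card at the top of this thread, a rough equivalent can be pieced
together from existing commands (a sketch only, assuming Luminous-style
device classes as used in this cluster):

    # device classes defined in the CRUSH map
    ceph osd crush class ls

    # the per-class shadow roots (default~hdd, default~ssd, ...) and
    # their weights
    ceph osd crush tree --show-shadow

    # OSD ids belonging to one class; cross-reference these with the
    # SIZE/AVAIL columns of "ceph osd df" to add up capacity per class
    ceph osd crush class ls-osd ssd
    ceph osd df

Summing SIZE and AVAIL over the OSDs listed for each class gives the
per-class totals that the GLOBAL section currently lumps together.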