Those osds are intentionally out, yes. (They were drained to be
replaced.) I have already fixed two clusters' stats with this method:
both had up-but-out OSDs, and stopping the up/out OSDs fixed the stats.

I opened a tracker for this: https://tracker.ceph.com/issues/48385

-- dan

On Thu, Nov 26, 2020 at 8:14 PM Igor Fedotov <ifedotov@xxxxxxx> wrote:
>
> Also wondering if you have the same "gap" OSDs at the different
> cluster(s) which show stats improperly?
>
>
> On 11/26/2020 10:08 PM, Dan van der Ster wrote:
> > Hey, that's it!
> >
> > I stopped the up-but-out OSDs (100 and 177), and now the stats are correct!
> >
> > # ceph df
> > RAW STORAGE:
> >     CLASS     SIZE        AVAIL       USED        RAW USED     %RAW USED
> >     hdd       5.5 PiB     1.2 PiB     4.3 PiB     4.3 PiB          78.62
> >     TOTAL     5.5 PiB     1.2 PiB     4.3 PiB     4.3 PiB          78.62
> >
> > POOLS:
> >     POOL       ID     STORED      OBJECTS     USED        %USED     MAX AVAIL
> >     public     68     2.9 PiB     143.56M     4.3 PiB     84.55       538 TiB
> >     test       71      29 MiB       6.56k     1.2 GiB         0       269 TiB
> >     foo        72     1.2 GiB         308     3.6 GiB         0       269 TiB
> >
> >
> > On Thu, Nov 26, 2020 at 8:02 PM Dan van der Ster <dan@xxxxxxxxxxxxxx> wrote:
> >> There are a couple of gaps, yes: https://termbin.com/9mx1
> >>
> >> What should I do?
> >>
> >> -- dan
> >>
> >> On Thu, Nov 26, 2020 at 7:52 PM Igor Fedotov <ifedotov@xxxxxxx> wrote:
> >>> Does "ceph osd df tree" show the stats properly (I mean, are there no
> >>> evident gaps like unexpected zero values) for all the daemons?
> >>>
> >>>
> >>>> 1. Anyway, I found something weird...
> >>>>
> >>>> I created a new 1-PG pool "foo" on a different cluster and wrote some
> >>>> data to it.
> >>>>
> >>>> The stored and used values are equal:
> >>>>
> >>>> Thu 26 Nov 19:26:58 CET 2020
> >>>> RAW STORAGE:
> >>>>     CLASS     SIZE        AVAIL       USED        RAW USED     %RAW USED
> >>>>     hdd       5.5 PiB     1.2 PiB     4.3 PiB     4.3 PiB          78.31
> >>>>     TOTAL     5.5 PiB     1.2 PiB     4.3 PiB     4.3 PiB          78.31
> >>>>
> >>>> POOLS:
> >>>>     POOL       ID     STORED      OBJECTS     USED        %USED     MAX AVAIL
> >>>>     public     68     2.9 PiB     143.54M     2.9 PiB     78.49       538 TiB
> >>>>     test       71      29 MiB       6.56k      29 MiB         0       269 TiB
> >>>>     foo        72     1.2 GiB         308     1.2 GiB         0       269 TiB
> >>>>
> >>>> But when I restarted the three relevant OSDs, the bytes_used was
> >>>> temporarily reported correctly:
> >>>>
> >>>> Thu 26 Nov 19:27:00 CET 2020
> >>>> RAW STORAGE:
> >>>>     CLASS     SIZE        AVAIL       USED        RAW USED     %RAW USED
> >>>>     hdd       5.5 PiB     1.2 PiB     4.3 PiB     4.3 PiB          78.62
> >>>>     TOTAL     5.5 PiB     1.2 PiB     4.3 PiB     4.3 PiB          78.62
> >>>>
> >>>> POOLS:
> >>>>     POOL       ID     STORED      OBJECTS     USED        %USED     MAX AVAIL
> >>>>     public     68     2.9 PiB     143.54M     4.3 PiB     84.55       538 TiB
> >>>>     test       71      29 MiB       6.56k     1.2 GiB         0       269 TiB
> >>>>     foo        72     1.2 GiB         308     3.6 GiB         0       269 TiB
> >>>>
> >>>> But a few seconds later it was back to used == stored:
> >>>>
> >>>> Thu 26 Nov 19:27:03 CET 2020
> >>>> RAW STORAGE:
> >>>>     CLASS     SIZE        AVAIL       USED        RAW USED     %RAW USED
> >>>>     hdd       5.5 PiB     1.2 PiB     4.3 PiB     4.3 PiB          78.47
> >>>>     TOTAL     5.5 PiB     1.2 PiB     4.3 PiB     4.3 PiB          78.47
> >>>>
> >>>> POOLS:
> >>>>     POOL       ID     STORED      OBJECTS     USED        %USED     MAX AVAIL
> >>>>     public     68     2.9 PiB     143.54M     2.9 PiB     78.49       538 TiB
> >>>>     test       71      29 MiB       6.56k      29 MiB         0       269 TiB
> >>>>     foo        72     1.2 GiB         308     1.2 GiB         0       269 TiB
> >>>>
> >>>> It seems to report the correct stats only while the PG is peering (or
> >>>> in some other transient state).
> >>>> I've restarted all three relevant OSDs now -- the stats are again
> >>>> reported as stored == used.
> >>>>
> >>>> 2. Another data point -- I found another old cluster that reports
> >>>> stored/used correctly. I have no idea what might be different about
> >>>> that cluster -- we updated it just like the others.
> >>>>
> >>>> Cheers, Dan
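(For reference, the reproduction described above boils down to roughly
the following sketch. It assumes a test cluster where you can freely
create pools; the pool name "foo", the 64K payload, and the OSD ids
100-102 are placeholders, not the actual ids from this thread.)

    ceph osd pool create foo 1 1   # a single PG, so one acting set holds all the data
    dd if=/dev/urandom of=/tmp/obj bs=64K count=1
    rados -p foo put testobj /tmp/obj
    ceph pg ls-by-pool foo         # note the PG id and its ACTING osds
    ceph df -f json | jq '.pools[] | select(.name=="foo") | .stats'
    systemctl restart ceph-osd@100 ceph-osd@101 ceph-osd@102
    watch -n1 'ceph df | grep foo' # USED flips to ~3x STORED while the PG peers, then reverts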
> >>>> On Thu, Nov 26, 2020 at 6:22 PM Igor Fedotov <ifedotov@xxxxxxx> wrote:
> >>>>> For a specific BlueStore instance you can learn the relevant statfs
> >>>>> output by setting debug_bluestore to 20 and leaving the OSD alone for
> >>>>> 5-10 seconds (or maybe a couple of minutes -- I don't remember the
> >>>>> exact statfs poll period).
> >>>>>
> >>>>> Then grep the osd log for "statfs" and/or "pool_statfs" and read the
> >>>>> output as formatted by the following operator (taken from
> >>>>> src/osd/osd_types.cc):
> >>>>>
> >>>>> ostream& operator<<(ostream& out, const store_statfs_t &s)
> >>>>> {
> >>>>>   out << std::hex
> >>>>>       << "store_statfs(0x" << s.available
> >>>>>       << "/0x" << s.internally_reserved
> >>>>>       << "/0x" << s.total
> >>>>>       << ", data 0x" << s.data_stored
> >>>>>       << "/0x" << s.allocated
> >>>>>       << ", compress 0x" << s.data_compressed
> >>>>>       << "/0x" << s.data_compressed_allocated
> >>>>>       << "/0x" << s.data_compressed_original
> >>>>>       << ", omap 0x" << s.omap_allocated
> >>>>>       << ", meta 0x" << s.internal_metadata
> >>>>>       << std::dec
> >>>>>       << ")";
> >>>>>   return out;
> >>>>> }
> >>>>>
> >>>>> But honestly I doubt it is BlueStore that reports incorrectly, since
> >>>>> it doesn't care about replication.
> >>>>>
> >>>>> It looks rather like a lack of stats from some replicas, or improper
> >>>>> pg replica factor processing...
> >>>>>
> >>>>> Perhaps legacy vs. new pool is what matters... Can you try to create
> >>>>> a new pool at the old cluster, fill it with some data (e.g. just a
> >>>>> single 64K object), and check the stats?
> >>>>>
> >>>>>
> >>>>> Thanks,
> >>>>>
> >>>>> Igor
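(Concretely, Igor's procedure looks something like the sketch below.
The osd id 100 and the default log path are examples; the exact poll
period is uncertain, hence the generous sleep. The matched values are
printed in hex, per the operator above.)

    ceph tell osd.100 config set debug_bluestore 20   # runtime change, no restart needed
    sleep 120                                         # allow at least one statfs poll
    grep -E 'statfs|pool_statfs' /var/log/ceph/ceph-osd.100.log | tail -5
    ceph tell osd.100 config set debug_bluestore 1/5  # restore the default level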
> >>>>> On 11/26/2020 8:00 PM, Dan van der Ster wrote:
> >>>>>> Hi Igor,
> >>>>>>
> >>>>>> No BLUESTORE_LEGACY_STATFS warning, and bluestore_warn_on_legacy_statfs
> >>>>>> is at its default (true) on this (and every) cluster.
> >>>>>> I'm quite sure we did the statfs conversion during one of the recent
> >>>>>> upgrades (I forget which one exactly).
> >>>>>>
> >>>>>> # ceph tell osd.* config get bluestore_warn_on_legacy_statfs | grep -v true
> >>>>>> #
> >>>>>>
> >>>>>> Is there a command to see the statfs reported by an individual OSD?
> >>>>>> We have a mix of ~year-old and recently recreated OSDs, so I could try
> >>>>>> to see if they differ.
> >>>>>>
> >>>>>> Thanks!
> >>>>>>
> >>>>>> Dan
> >>>>>>
> >>>>>>
> >>>>>> On Thu, Nov 26, 2020 at 5:50 PM Igor Fedotov <ifedotov@xxxxxxx> wrote:
> >>>>>>> Hi Dan,
> >>>>>>>
> >>>>>>> don't you have a BLUESTORE_LEGACY_STATFS alert raised (it might be
> >>>>>>> silenced by the bluestore_warn_on_legacy_statfs param) for the older
> >>>>>>> cluster?
> >>>>>>>
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>>
> >>>>>>> Igor
> >>>>>>>
> >>>>>>>
> >>>>>>> On 11/26/2020 7:29 PM, Dan van der Ster wrote:
> >>>>>>>> Hi,
> >>>>>>>>
> >>>>>>>> Depending on which cluster I look at (all running v14.2.11),
> >>>>>>>> bytes_used variably reports either raw space or stored bytes.
> >>>>>>>>
> >>>>>>>> Here's a 7-year-old cluster:
> >>>>>>>>
> >>>>>>>> # ceph df -f json | jq .pools[0]
> >>>>>>>> {
> >>>>>>>>   "name": "volumes",
> >>>>>>>>   "id": 4,
> >>>>>>>>   "stats": {
> >>>>>>>>     "stored": 1229308190855881,
> >>>>>>>>     "objects": 294401604,
> >>>>>>>>     "kb_used": 1200496280133,
> >>>>>>>>     "bytes_used": 1229308190855881,
> >>>>>>>>     "percent_used": 0.4401889145374298,
> >>>>>>>>     "max_avail": 521125025021952
> >>>>>>>>   }
> >>>>>>>> }
> >>>>>>>>
> >>>>>>>> Note that stored == bytes_used for that pool (it is a 3x replica pool).
> >>>>>>>>
> >>>>>>>> But here's a newer cluster (installed recently with nautilus):
> >>>>>>>>
> >>>>>>>> # ceph df -f json | jq .pools[0]
> >>>>>>>> {
> >>>>>>>>   "name": "volumes",
> >>>>>>>>   "id": 1,
> >>>>>>>>   "stats": {
> >>>>>>>>     "stored": 680977600893041,
> >>>>>>>>     "objects": 163155803,
> >>>>>>>>     "kb_used": 1995736271829,
> >>>>>>>>     "bytes_used": 2043633942351985,
> >>>>>>>>     "percent_used": 0.23379847407341003,
> >>>>>>>>     "max_avail": 2232457428467712
> >>>>>>>>   }
> >>>>>>>> }
> >>>>>>>>
> >>>>>>>> In the second cluster, bytes_used is 3x stored.
> >>>>>>>>
> >>>>>>>> Does anyone know why these are not reported consistently?
> >>>>>>>> Having noticed this just now, I'll update our monitoring to plot
> >>>>>>>> stored rather than bytes_used from now on.
> >>>>>>>>
> >>>>>>>> Thanks!
> >>>>>>>>
> >>>>>>>> Dan
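(A quick way to spot affected pools, given the thread's findings -- a
sketch assuming replicated pools: bytes_used divided by stored should
come out near the replica count (3 here) when raw accounting works, and
near 1.0 on a pool hit by this bug. Empty pools are skipped to avoid
dividing by zero.)

    ceph df -f json | jq -r '.pools[] | select(.stats.stored > 0)
        | "\(.name)\t\(.stats.bytes_used / .stats.stored)"'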