For a specific BlueStore instance you can learn the relevant statfs output by
setting debug_bluestore to 20 and leaving the OSD running for 5-10 seconds (or
maybe a couple of minutes - I don't remember the exact statfs poll period).
Then grep the OSD log for "statfs" and/or "pool_statfs"; the output is
formatted as per the following operator (taken from src/osd/osd_types.cc):
ostream& operator<<(ostream& out, const store_statfs_t &s)
{
  out << std::hex
      << "store_statfs(0x" << s.available
      << "/0x" << s.internally_reserved
      << "/0x" << s.total
      << ", data 0x" << s.data_stored
      << "/0x" << s.allocated
      << ", compress 0x" << s.data_compressed
      << "/0x" << s.data_compressed_allocated
      << "/0x" << s.data_compressed_original
      << ", omap 0x" << s.omap_allocated
      << ", meta 0x" << s.internal_metadata
      << std::dec
      << ")";
  return out;
}
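If it helps, here's a rough Python sketch for pulling those hex fields out of
the log. The field order is taken from the operator above; the sample line is
made up, not from a real OSD log:

```python
import re

# Field order follows operator<<(ostream&, const store_statfs_t&)
# in src/osd/osd_types.cc; all values are printed in hex.
STATFS_RE = re.compile(
    r"store_statfs\(0x(?P<available>[0-9a-f]+)"
    r"/0x(?P<internally_reserved>[0-9a-f]+)"
    r"/0x(?P<total>[0-9a-f]+)"
    r", data 0x(?P<data_stored>[0-9a-f]+)"
    r"/0x(?P<allocated>[0-9a-f]+)"
    r", compress 0x(?P<data_compressed>[0-9a-f]+)"
    r"/0x(?P<data_compressed_allocated>[0-9a-f]+)"
    r"/0x(?P<data_compressed_original>[0-9a-f]+)"
    r", omap 0x(?P<omap_allocated>[0-9a-f]+)"
    r", meta 0x(?P<internal_metadata>[0-9a-f]+)\)"
)

def parse_store_statfs(line):
    """Return the statfs fields as ints (bytes), or None if no match."""
    m = STATFS_RE.search(line)
    if m is None:
        return None
    return {k: int(v, 16) for k, v in m.groupdict().items()}

# Fabricated example line, just to show the shape:
sample = ("store_statfs(0xe8000000/0x0/0x100000000, data 0x1234000/0x1300000, "
          "compress 0x0/0x0/0x0, omap 0x10000, meta 0x400000)")
stats = parse_store_statfs(sample)
print(stats["total"], stats["data_stored"])
```

Running that over the grep results should make it easy to compare what each
OSD reports.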
But honestly I doubt it is BlueStore that reports incorrectly, since
BlueStore doesn't care about replication.
It looks more like missing stats from some replicas, or improper processing
of the PG replication factor...
Perhaps it's legacy vs. new pool that matters... Can you try creating a new
pool on the old cluster, filling it with some data (e.g. just a single 64K
object), and checking the stats?
Thanks,
Igor
On 11/26/2020 8:00 PM, Dan van der Ster wrote:
Hi Igor,
No BLUESTORE_LEGACY_STATFS warning, and
bluestore_warn_on_legacy_statfs is the default true on this (and all)
clusters.
I'm quite sure we did the statfs conversion during one of the recent
upgrades (I forget which one exactly).
# ceph tell osd.* config get bluestore_warn_on_legacy_statfs | grep -v true
#
Is there a command to see the statfs reported by an individual OSD ?
We have a mix of ~year old and recently recreated OSDs, so I could try
to see if they differ.
Thanks!
Dan
On Thu, Nov 26, 2020 at 5:50 PM Igor Fedotov <ifedotov@xxxxxxx> wrote:
Hi Dan
don't you have a BLUESTORE_LEGACY_STATFS alert raised (it might be silenced
by the bluestore_warn_on_legacy_statfs param) on the older cluster?
Thanks,
Igor
On 11/26/2020 7:29 PM, Dan van der Ster wrote:
Hi,
Depending on which cluster I look at (all running v14.2.11), bytes_used
variably reports either raw space or stored bytes.
Here's a 7-year-old cluster:
# ceph df -f json | jq .pools[0]
{
  "name": "volumes",
  "id": 4,
  "stats": {
    "stored": 1229308190855881,
    "objects": 294401604,
    "kb_used": 1200496280133,
    "bytes_used": 1229308190855881,
    "percent_used": 0.4401889145374298,
    "max_avail": 521125025021952
  }
}
Note that stored == bytes_used for that pool (and this is a 3x replica pool).
But here's a newer cluster (installed recently with nautilus):
# ceph df -f json | jq .pools[0]
{
  "name": "volumes",
  "id": 1,
  "stats": {
    "stored": 680977600893041,
    "objects": 163155803,
    "kb_used": 1995736271829,
    "bytes_used": 2043633942351985,
    "percent_used": 0.23379847407341003,
    "max_avail": 2232457428467712
  }
}
In the second cluster, bytes_used is 3x stored.
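To make that concrete, here's the arithmetic on the numbers copied from the
two jq outputs above (both pools are 3x replicated):

```python
# Pool stats copied from the two `ceph df -f json` outputs above.
old = {"stored": 1229308190855881, "bytes_used": 1229308190855881}
new = {"stored": 680977600893041, "bytes_used": 2043633942351985}

# If bytes_used reported raw space, we'd expect ~3x stored on a 3x pool.
old_ratio = old["bytes_used"] / old["stored"]
new_ratio = new["bytes_used"] / new["stored"]
print(f"old cluster: bytes_used/stored = {old_ratio:.3f}")  # exactly 1.0
print(f"new cluster: bytes_used/stored = {new_ratio:.3f}")  # ~3.0
```

So the old cluster reports bytes_used == stored, while the new one reports
the expected ~3x raw usage.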
Does anyone know why these are not reported consistently?
Noticing this just now, I'll update our monitoring to plot stored
rather than bytes_used from now on.
Thanks!
Dan
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx