On Tue, Feb 28, 2023 at 12:56 PM Reed Dier <reed.dier@xxxxxxxxxxx> wrote:

> I think a few other things that could help would be `ceph osd df tree`
> which will show the hierarchy across different crush domains.

Good idea: https://pastebin.com/y07TKt52

> And if you're doing something like erasure coded pools, or something other
> than replication 3, maybe `ceph osd crush rule dump` may provide some
> further context with the tree output.

No erasure coded pools - all replication.

> Also, the cluster is running Luminous (12) which went EOL 3 years ago
> tomorrow
> <https://docs.ceph.com/en/latest/releases/index.html#archived-releases>.
> So there are also likely a good bit of improvements all around under the
> hood to be gained by moving forward from Luminous.

Yes, nobody here wants to touch upgrading this at all - everyone is too
terrified of breaking things. This Ceph deployment is serving several
hundred VMs. The general feeling is that we're stuck on Luminous and that
upgrading to anything else would be destructive. I refuse to believe that
is true. At the very least, if we upgraded everything to 12.2.3 we'd have
the 'balancer' module, which I believe arrived in 12.2.2.

What would you recommend upgrading Luminous to?

> Though, I would say take care of the scrub errors prior to doing any major
> upgrades, as well as checking your upgrade path (can only upgrade two
> releases at a time, if you have filestore OSDs, etc).

Yeah, there seems to be a fear here that attempting to repair those will
hurt performance even further. I disagree and think we should repair them
immediately.

Also, there seems to be a belief that bluestore is an 'all-or-nothing'
proposition and that it's impossible to migrate from filestore to
bluestore. Yet from what I can see you can run a mixture of both in one
deployment, and migrating OSDs from filestore to bluestore is indeed
possible.

TL;DR -- there is a *lot* of fear of touching this thing because nobody
here is truly an 'expert' in it at the moment. But not touching it is how
we have ended up with broken things and horrendous performance.

Thanks Reed!
-Dave

> -Reed
>
> On Feb 28, 2023, at 11:12 AM, Dave Ingram <dave@xxxxxxxxxxxx> wrote:
>
> There is a lot of variability in drive sizes - two different sets of
> admins added disks sized between 6TB and 16TB and I suspect this and
> imbalanced weighting is to blame.
>
> CEPH OSD DF:
>
> (not going to paste that all in here): https://pastebin.com/CNW5RKWx
>
> What else am I missing in terms of what to share with you all?
>
> Thanks all,
> -Dave

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
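
[Editor's note] For reference, repairing the scrub errors discussed above
is a per-PG operation and can be done one PG at a time to limit any
performance impact. A minimal sketch for a Luminous cluster; the PG id
2.1a below is only a placeholder, use the ids that `ceph health detail`
actually reports:

    # list the PGs currently flagged inconsistent and the OSDs they map to
    ceph health detail | grep inconsistent

    # optionally inspect what is actually wrong inside one of them
    rados list-inconsistent-obj 2.1a --format=json-pretty

    # ask the primary OSD to repair that PG (runs as a deep-scrub)
    ceph pg repair 2.1a

    # watch the cluster log until the inconsistency clears
    ceph -w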
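
On the balancer point: once the mgr daemons are on a Luminous release that
ships the balancer module, it can be enabled without any manual
reweighting. A rough sketch, assuming default settings; crush-compat mode
is generally the safer choice if pre-Luminous clients may still be
connected, while upmap mode requires all clients to speak Luminous:

    ceph mgr module enable balancer
    ceph balancer mode crush-compat     # or 'upmap' once all clients are Luminous
    ceph balancer eval                  # current distribution score (lower is better)
    ceph balancer on                    # start background optimization
    ceph balancer status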
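
And on the filestore question: the upstream bluestore migration procedure
is indeed per-OSD, so a mixed filestore/bluestore cluster is a normal
intermediate state. A sketch of converting a single OSD, loosely following
the documented "mark out and replace" approach; the OSD id 23 and /dev/sdX
are placeholders, and backfill should be allowed to finish before moving
on to the next OSD:

    ID=23
    DEVICE=/dev/sdX

    # drain the OSD and wait until its data is fully replicated elsewhere
    ceph osd out $ID
    while ! ceph osd safe-to-destroy osd.$ID ; do sleep 60 ; done

    # stop and tear down the old filestore OSD
    systemctl stop ceph-osd@$ID
    umount /var/lib/ceph/osd/ceph-$ID
    ceph osd destroy $ID --yes-i-really-mean-it
    ceph-volume lvm zap $DEVICE

    # recreate it as bluestore, reusing the same OSD id
    ceph-volume lvm create --bluestore --data $DEVICE --osd-id $ID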