Re: Prioritized pool recovery

On 5/6/2019 6:37 PM, Gregory Farnum wrote:
> Hmm, I didn't know we had this functionality before. It looks to be
> changing quite a lot at the moment, so be aware this will likely
> require reconfiguring later.

Good to know, and not a problem. In any case, I'd assume it won't change substantially for luminous, correct?
> I'm not seeing this in the luminous docs, are you sure?

You're probably right, but there are options for this in luminous:

# ceph osd pool get vm
Invalid command: missing required parameter var([...] recovery_priority|recovery_op_priority [...])

> The source code indicates in Luminous it's 0-254. (As I said, things
> have changed, so in the current master build it seems to be -10 to 10
> and configured a bit differently.)
> The 1-63 values generally apply to op priorities within the OSD, and
> are used as part of a weighted priority queue when selecting the next
> op to work on out of those available; you may have been looking at
> osd_recovery_op_priority which is on that scale and should apply to
> individual recovery messages/ops but will not work to schedule PGs
> differently.

So I was probably looking at the OSD level then.
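For my own notes: I believe the OSD-level knob I was looking at can be checked on a running daemon via the admin socket, along the lines of the command below (osd.0 is just a placeholder, and the default is 3 if I'm reading the docs right):

# ceph daemon osd.0 config get osd_recovery_op_priority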


>> Questions:
>> 1) If I have pools 1-4, what would I set these values to in order to backfill pools 1, 2, 3, and then 4 in order?

> So if I'm reading the code right, they just need to be different
> weights, and the higher value will win when trying to get a
> reservation if there's a queue of them. (However, it's possible that
> lower-priority pools will send off requests first and get to do one or
> two PGs first, then the higher-priority pool will get to do all its
> work before that pool continues.)

Is 0 the higher priority, or is 254? And what's the difference between recovery_priority and recovery_op_priority?

Reading the OSD docs, the _op_ variant is the "priority set for recovery operations," and the non-op variant is the "priority set for recovery work queue." For someone new to Ceph such as myself, those read like the same thing at a glance. Wouldn't the recovery operations be part of the work queue?

And would this apply the same for the pools?
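If "higher" does mean the larger number, then for my four pools I'm guessing the per-pool settings would look something like the following (pool names are placeholders; I assume the exact values don't matter as long as the ordering does):

# ceph osd pool set pool1 recovery_priority 4
# ceph osd pool set pool2 recovery_priority 3
# ceph osd pool set pool3 recovery_priority 2
# ceph osd pool set pool4 recovery_priority 1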


>> 2) Assuming this is possible, how do I ensure that backfill isn't prioritized over client I/O?

> This is an ongoing issue but I don't think the pool prioritization
> will change the existing mechanisms.

Okay, understood. Not a huge problem, I'm primarily looking for understanding.
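For anyone else following along: the existing mechanisms I've found for keeping recovery/backfill from crowding out client I/O are the OSD-level throttles rather than the priorities, e.g. osd_max_backfills and osd_recovery_max_active. As far as I can tell they can be lowered on the fly with injectargs, along these lines (values here are only illustrative, and injectargs changes don't survive a restart):

# ceph tell osd.* injectargs '--osd_max_backfills 1 --osd_recovery_max_active 1'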


>> 3) Is there a command that enumerates the weights of the current operations (so that I can observe what's going on)?

"ceph osd pool ls detail" will include them.


Perfect!
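In case it helps the next person: I'm assuming something along these lines will show the value once it has been set, plus the per-pool get form from earlier:

# ceph osd pool ls detail | grep -i recovery
# ceph osd pool get vm recovery_priority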

Thank you very much for the information. Once I have a little more, I'll probably work toward sending a pull request for the docs...


--Kyle
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


