On Mon, May 6, 2019 at 6:41 PM Kyle Brantley <kyle@xxxxxxxxxxxxxx> wrote:
>
> On 5/6/2019 6:37 PM, Gregory Farnum wrote:
> > Hmm, I didn't know we had this functionality before. It looks to be
> > changing quite a lot at the moment, so be aware this will likely
> > require reconfiguring later.
>
> Good to know, and not a problem. In any case, I'd assume it won't
> change substantially for Luminous, correct?
>
> > I'm not seeing this in the luminous docs, are you sure? The source
>
> You're probably right, but there are options for this in Luminous:
>
> # ceph osd pool get vm
> Invalid command: missing required parameter var([...] recovery_priority|recovery_op_priority [...])
>
> > code indicates in Luminous it's 0-254. (As I said, things have
> > changed, so in the current master build it seems to be -10 to 10 and
> > configured a bit differently.)
>
> > The 1-63 values generally apply to op priorities within the OSD, and
> > are used as part of a weighted priority queue when selecting the next
> > op to work on out of those available; you may have been looking at
> > osd_recovery_op_priority, which is on that scale and should apply to
> > individual recovery messages/ops but will not work to schedule PGs
> > differently.
>
> So I was probably looking at the OSD level then.

Ah, sorry: I looked at the recovery_priority option and skipped
recovery_op_priority entirely. recovery_op_priority sets the priority on
the message dispatch itself and is on the 0-63 scale. I wouldn't mess
around with that; the higher you set it, the more recovery messages get
dispatched compared to client operations.

> >> Questions:
> >> 1) If I have pools 1-4, what would I set these values to in order
> >> to backfill pools 1, 2, 3, and then 4 in order?
> >
> > So if I'm reading the code right, they just need to be different
> > weights, and the higher value will win when trying to get a
> > reservation if there's a queue of them. (However, it's possible that
> > lower-priority pools will send off requests first and get to do one
> > or two PGs first, then the higher-priority pool will get to do all
> > its work before that pool continues.)
>
> Which is higher: 0 or 254? And what's the difference between
> recovery_priority and recovery_op_priority?

For recovery_priority, larger numbers are higher. When picking a PG off
the list of pending reservations, the OSD takes the highest-priority PG
it sees and, within that priority, the first request to have come in.

> In reading the docs for the OSD, _op_ is "priority set for recovery
> operations," and non-op is "priority set for recovery work queue." For
> someone new to Ceph such as myself, this reads like the same thing at
> a glance. Would the recovery operations not be a part of the work
> queue?
>
> And would this apply the same for the pools?

When a PG needs to recover, it has to acquire a reservation slot on the
local and remote nodes (to limit the total amount of work being done).
It sends off a request, and once the total number of reservations is
hit, further requests go into a pending queue. The recovery_priority
orders that queue.
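To make question 1 concrete, here's a minimal sketch (untested, and
"pool1" through "pool4" are stand-ins for whatever your pools are
actually named): give each pool a distinct value, highest first, e.g.

# ceph osd pool set pool1 recovery_priority 4
# ceph osd pool set pool2 recovery_priority 3
# ceph osd pool set pool3 recovery_priority 2
# ceph osd pool set pool4 recovery_priority 1

pool1's PGs should then win the reservation queue first, subject to the
caveat above that a lower-priority pool may still sneak in a PG or two
before the queue fills up.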
> >> 2) Assuming this is possible, how do I ensure that backfill isn't
> >> prioritized over client I/O?
> >
> > This is an ongoing issue, but I don't think the pool prioritization
> > will change the existing mechanisms.
>
> Okay, understood. Not a huge problem; I'm primarily looking for
> understanding.
>
> >> 3) Is there a command that enumerates the weights of the current
> >> operations (so that I can observe what's going on)?
> >
> > "ceph osd pool ls detail" will include them.
>
> Perfect!
>
> Thank you very much for the information. Once I have a little more,
> I'm probably going to work towards sending a pull request in for the
> docs...
>
> --Kyle
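One more sketch on the observation side of question 3 (untested; I'm
assuming the recovery_priority field is only printed for pools where it
has actually been set):

# ceph osd pool ls detail | grep recovery_priority

That should print the full detail line for each pool carrying a
recovery_priority, which makes it easy to confirm the ordering before
kicking off a backfill.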