Re: A new rados api set_client_priority

On Thu, Sep 10, 2015 at 2:43 AM, 瞿天善 <qutianshan@xxxxxxxxx> wrote:
> Hi,
>   I'm working on the topic
> <http://tracker.ceph.com/projects/ceph/wiki/Messenger_-_priorities_for_Client>
> , first step I add a new rados api set_client_priority, then I will do
> more test to get the idea of how to balance the performance.
>    code is in the pr <https://github.com/ceph/ceph/pull/5602>,
> originally client read the config to get priority.
>    Is the api ok?

The API looks fine at first glance, but I continue to be concerned
about exposing priorities like this. Right now they behave fairly
oddly. For instance:
1) If you have multiple message priorities on the same client, the
higher-priority ones are always sent first whenever any are queued.
2) When receiving messages, the higher-priority ones in the queue are
always delivered ahead of lower-priority ones...
2b) ...unless the ms_fast_dispatch interface is involved (as with
client<->OSD communication), in which case priority is ignored by the
receiving messenger and the message is just dispatched immediately.
3) The priority is used internally by the OSD for a proper tokenized
delivery scheme.
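
To make the difference between the strict behavior in 1) and 2) and
the tokenized behavior in 3) concrete, here is a toy sketch. It is
not Ceph's PrioritizedQueue and not the OSD's actual algorithm; the
function names, the priority values (3 and 1), and the
one-token-per-message cost are all assumptions made up for
illustration.

```python
from collections import deque


def strict_dequeue(queues):
    """Always serve the highest non-empty priority first.

    `queues` maps priority (higher number = more urgent) to a deque.
    Lower priorities only run once everything above them is drained.
    """
    order = []
    while any(queues.values()):
        prio = max(p for p, q in queues.items() if q)
        order.append(queues[prio].popleft())
    return order


def weighted_dequeue(queues, cost=1):
    """Token-style dequeue (illustrative only).

    Each round, a non-empty priority level earns `prio` tokens and
    spends `cost` tokens per message, so every level makes progress
    in proportion to its priority instead of being starved.
    """
    order = []
    tokens = {p: 0 for p in queues}
    while any(queues.values()):
        for prio in sorted(queues, reverse=True):
            q = queues[prio]
            if not q:
                tokens[prio] = 0  # idle levels do not bank tokens
                continue
            tokens[prio] += prio
            while q and tokens[prio] >= cost:
                tokens[prio] -= cost
                order.append(q.popleft())
    return order


def make_queues():
    # Hypothetical workload: six high-priority and two low-priority messages.
    return {
        3: deque(f"hi{i}" for i in range(6)),
        1: deque(f"lo{i}" for i in range(2)),
    }


if __name__ == "__main__":
    print("strict:  ", strict_dequeue(make_queues()))
    print("weighted:", weighted_dequeue(make_queues()))
```

With the strict version the low-priority messages wait until every
high-priority message is gone; with the token version they are
interleaved at roughly a 3:1 ratio.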

I think for QoS like this you're most interested in the third
behavior, but simply setting message priority like this will also
apply to the monitors and MDSes (and potentially do strange things!),
and without guards on the allowed values users could do stuff like
set their priority to the max value and basically crowd out everybody
else in the cluster.
...come to think of it, that's probably an attack they could mount
currently if they've got a RADOS key they control. But it's at least
*difficult* to do accidentally since the interfaces generally aren't
exposed. ;)

So before merging something like this I'd like to see us have a more
complete solution carved out:
0) Decide whether we even want message priority and in-daemon
processing priority to be conflated.
1) What does the messenger do? Make it as consistent as possible,
probably with tokenization like the OSD has. (This won't be difficult
code-wise as we're already using PrioritizedQueue, but we use the
"strict" portion of it for all messages right now.) Make sure stuff
doesn't break.
2) Establish a way to specify different priorities for different
daemons. Just because somebody gets faster data IO doesn't mean they
get faster MDS access. Everybody should probably be equal to the
monitors. ...and this is one of the things that the proposed API would
need to be changed to support.
3) Add capabilities support for specifying priority limits, along with
a reasonable default. Check these on message receipt. (...and reject
messages outside the bounds? Move them in-bounds?)
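
As a rough sketch of what 2) and 3) could look like together: a
per-daemon-type limit that is checked on message receipt, with both
the "reject" and the "move in-bounds" options. Nothing here is an
existing Ceph interface; the cap table, the client names, the numeric
ranges, and the function name are hypothetical.

```python
from dataclasses import dataclass


class PriorityOutOfBounds(Exception):
    """Raised in 'reject' mode when a request falls outside the cap."""


@dataclass(frozen=True)
class PriorityCap:
    lo: int
    hi: int


# Hypothetical values: everybody gets the same fixed priority at the monitors.
MON_PRIORITY = 127
# Hypothetical default for clients with no explicit cap.
DEFAULT_CAP = PriorityCap(lo=1, hi=127)
# Hypothetical per-(client, daemon type) caps: faster data IO on the
# OSDs does not imply faster MDS access.
CAPS = {
    ("client.fast", "osd"): PriorityCap(lo=1, hi=200),
    ("client.fast", "mds"): PriorityCap(lo=1, hi=127),
}


def effective_priority(client, daemon, requested, mode="clamp"):
    """Return the priority the receiving daemon would actually use.

    mode="clamp" moves out-of-range requests in-bounds;
    mode="reject" raises PriorityOutOfBounds instead.
    """
    if daemon == "mon":
        return MON_PRIORITY  # client-requested priority is ignored
    cap = CAPS.get((client, daemon), DEFAULT_CAP)
    if cap.lo <= requested <= cap.hi:
        return requested
    if mode == "reject":
        raise PriorityOutOfBounds(
            f"{client} requested {requested} for {daemon}, "
            f"allowed {cap.lo}..{cap.hi}"
        )
    return max(cap.lo, min(requested, cap.hi))


if __name__ == "__main__":
    for daemon in ("osd", "mds", "mon"):
        print(daemon, effective_priority("client.fast", daemon, 255))
```

Clamping keeps a misconfigured client working at the cost of silently
ignoring what it asked for; rejecting makes the misconfiguration
visible but turns it into an error the client has to handle.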
-Greg