Re: Request for Comments: Weighted Round Robin OP Queue

Hi Robert,

It definitely is exciting I think. Keep up the good work! :)

Mark

On 11/05/2015 09:14 AM, Robert LeBlanc wrote:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

Thanks Gregory,

People are most likely busy and haven't had time to digest this, and I
may be expecting more excitement from it (I'm excited due to the
results and probably also that such a large change still works). I'll
keep working towards a PR. This was mostly a proof of concept; now that
there is some data, I'll clean up the code.

I was thinking that a config option to choose the scheduler would be a
good idea. In terms of the project, what is the better approach: create
a new template and select the queue at each place the template class is
instantiated, perform the queue selection inside the same template
class, or something else I haven't thought of?
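
One possible shape for the "config option" approach is sketched below: both queues sit behind a common interface and a small factory picks one from a config string. This is only an illustration under stated assumptions. The names `OpQueue`, `StrictPriorityQueue`, `FifoQueue`, `make_op_queue`, and the values "strict" and "fifo" are hypothetical and are not taken from the Ceph tree.

```cpp
#include <deque>
#include <iterator>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical common interface that every queue implementation satisfies.
template <typename T>
class OpQueue {
public:
  virtual ~OpQueue() = default;
  virtual void enqueue(unsigned priority, T item) = 0;
  virtual T dequeue() = 0;
  virtual bool empty() const = 0;
};

// Always serves the highest non-empty priority first; FIFO within a priority.
template <typename T>
class StrictPriorityQueue : public OpQueue<T> {
  std::map<unsigned, std::deque<T>> buckets_;  // priority -> FIFO of items
public:
  void enqueue(unsigned priority, T item) override {
    buckets_[priority].push_back(std::move(item));
  }
  T dequeue() override {
    if (buckets_.empty()) throw std::out_of_range("queue is empty");
    auto it = std::prev(buckets_.end());  // largest key = highest priority
    T item = std::move(it->second.front());
    it->second.pop_front();
    if (it->second.empty()) buckets_.erase(it);
    return item;
  }
  bool empty() const override { return buckets_.empty(); }
};

// Ignores priority entirely; stands in for a second implementation.
template <typename T>
class FifoQueue : public OpQueue<T> {
  std::deque<T> items_;
public:
  void enqueue(unsigned /*priority*/, T item) override {
    items_.push_back(std::move(item));
  }
  T dequeue() override {
    if (items_.empty()) throw std::out_of_range("queue is empty");
    T item = std::move(items_.front());
    items_.pop_front();
    return item;
  }
  bool empty() const override { return items_.empty(); }
};

// Factory keyed on a config string (the option values are assumptions).
template <typename T>
std::unique_ptr<OpQueue<T>> make_op_queue(const std::string& name) {
  if (name == "strict")
    return std::unique_ptr<OpQueue<T>>(new StrictPriorityQueue<T>());
  if (name == "fifo")
    return std::unique_ptr<OpQueue<T>>(new FifoQueue<T>());
  throw std::invalid_argument("unknown queue type: " + name);
}
```

With this shape, callers hold a `std::unique_ptr<OpQueue<T>>` and never name the concrete type, at the cost of a virtual call per operation. Selecting the queue as a template parameter at each instantiation site avoids that indirection but fixes the choice at compile time.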

Are there public teuthology-openstack systems that could be used for
testing? I don't remember, I'll have to search back through the
mailing list archives.

I appreciate all the direction as I've tried to figure this out.

Thanks,
- ----------------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1


On Wed, Nov 4, 2015 at 8:20 PM, Gregory Farnum wrote:
On Wed, Nov 4, 2015 at 7:00 PM, Robert LeBlanc wrote:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

Thanks for your help on IRC Samuel. I think I found where I made a
mistake. I'll do some more testing. So far with max_backfills=1 on
spindles, the impact of setting an OSD out and in on a saturated
cluster seems to be minimal. On my I/O graphs it is hard to tell when
the OSD was out and when it was back in and recovering. If I/O requests
become blocked, they don't seem to linger long. All of the clients
report getting about the same amount of work done with little variance,
so no one client is getting indefinitely blocked (or blocked for really
long times) and skewing the results between clients like before.

So far this queue seems to be very positive. I'd hate to put a lot of
work into getting this ready to merge if there is little interest in it
(a lot of things to do at work and some other things I'd like to track
down in the Ceph code as well). What are some of the next steps for
something like this, meaning a pretty significant change to core code?

Well, step one is to convince people it's worthwhile. Your performance
information and anecdotal evidence of client impact are a pretty good
start. For it to get merged:
1) People will need to review it and verify it's not breaking anything
they can identify from the code. Things are a bit constricted right now,
but this is pretty small and of high interest. I make no promises
for the core team, but submitting a PR will be the way to start.
Getting positive buy-in from other contributors who are interested in
performance will also push it up the queue.
2) There will need to be a lot of testing on something like this.
Everything has to pass a run of the RADOS suite. Unfortunately this is
a bad month for that as the lab is getting physically shipped around
in a few weeks, so if you can afford to make it happen with the
teuthology-openstack stuff that will accelerate the timeline a lot (we
will still need to run it ourselves but once it's passed externally we
can put it in a lot more test runs we expect to pass, instead of in a
bucket with others that will all get blocked on any one failure).
3) For a new queuing system I suspect that rather than a direct merge
to default master, Sam will want to keep both in the code for a while
with a config value and run a lot of the nightlies on this one to
tease out any subtle races and bugs.
4) Eventually we become confident that it's in good shape and it
replaces the old queue.

Obviously those are the optimistic steps. ;)
-Greg


Thank you to all who took time to help point me in the right direction.
- ----------------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1


On Wed, Nov 4, 2015 at 12:49 PM, Samuel Just wrote:
I didn't look into it closely, but that almost certainly means that
your queue is reordering primary->replica replicated write messages.
-Sam
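
To illustrate the ordering concern in general terms, here is a minimal sketch of a weighted round robin queue that never reorders items within a single priority class. It assumes that ordering-sensitive messages share a priority and that a class's weight equals its priority; both are assumptions for illustration. The class name `WeightedRoundRobinQueue` is hypothetical, and this is not the code from the patch under discussion.

```cpp
#include <cstddef>
#include <deque>
#include <functional>
#include <map>
#include <stdexcept>
#include <utility>

template <typename T>
class WeightedRoundRobinQueue {
  struct Class {
    std::deque<T> items;  // FIFO: order within a class is never changed
    unsigned credit = 0;  // dequeues left for this class in the current round
  };
  // Keyed by priority, iterated from highest to lowest.
  std::map<unsigned, Class, std::greater<unsigned>> classes_;
  std::size_t size_ = 0;

public:
  void enqueue(unsigned priority, T item) {
    if (priority == 0)
      throw std::invalid_argument("priority must be positive");
    classes_[priority].items.push_back(std::move(item));
    ++size_;
  }

  T dequeue() {
    if (size_ == 0) throw std::out_of_range("queue is empty");
    for (;;) {
      for (auto& kv : classes_) {
        Class& c = kv.second;
        if (c.credit > 0 && !c.items.empty()) {
          --c.credit;
          T item = std::move(c.items.front());
          c.items.pop_front();
          --size_;
          return item;
        }
      }
      // No non-empty class has credit left: start a new round, giving
      // each class credit equal to its priority (its weight).
      for (auto& kv : classes_) kv.second.credit = kv.first;
    }
  }

  bool empty() const { return size_ == 0; }
};
```

With four items at priority 3 and two at priority 1, the first round serves three priority-3 items and then one priority-1 item, so the lower class keeps making progress while each class still drains strictly in arrival order. Messages that must stay ordered relative to each other but arrive with different priorities would still be interleaved by this scheme.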


-----BEGIN PGP SIGNATURE-----
Version: Mailvelope v1.2.3
Comment: https://www.mailvelope.com

wsFcBAEBCAAQBQJWOsY1CRDmVDuy+mK58QAA9LIQALIUgbS4BuDS704HPOpA
XwvGxspelMCaBkLHLgiHU4T/Jc8JaXhgdRMwMiKeLI246Z7hRngSGlIDYc4+
nP4kWZIkwbJeTa/Z6bM6C3itFtJmQpkPvdjI+GiME5ZdYvFgCZQyDD71rqja
H14m0+JsEaIHQF0JZz6OyNxbyRWsM+M68nOvpAx8/fOGHBC/0VwPbLrOUP9O
3J3NvbhN9xlYJeivXSAyzxmHQDD8mO1c1AUTrHgnTViD2k3fmcH0mOHIJ+jn
ARZbeLN3hlXG0i9PHpnHzBVNSxsfb5VPxX970R3gvRWIt40QV/QL7q2SajWP
ofxgEpkaO48ANQSYDlqSNcM+w46TtgcJljtX0vbrHIW3Skyaz4UZQ/dzX4lX
a5Zzk01oFwXfMd10KgVbJf78qVYHy2r5aq46iFnrFLU43iy+Qve7Kex4XZFi
vPFFVea89Of838NqTxW21+3oJthrz1g7RKHghZAbXaj3WKchuEU+uVG4XTo1
0PU4a5ZYVTH6zYHpwJo2/89OzdkBe9S6s00+4JmfVWWEhb6+QwUjBQp1TJbB
TnMzSKfzgRyi/wHThv2XcZN12tttZMM2L4Ea3mHG+cxOTTZ1opv8/H2mprm8
7UuO4vk5K0c4IwPVmt9m5DTVhyn4hZ/QJmc+NARD3zc1u3qWFLkH2WaRMpBb
mRWA
=kgAl
-----END PGP SIGNATURE-----
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html
-----BEGIN PGP SIGNATURE-----
Version: Mailvelope v1.2.3
Comment: https://www.mailvelope.com

wsFcBAEBCAAQBQJWO3JgCRDmVDuy+mK58QAAeCcP/jHnG3r257cdcRYZzg9o
iMOxnuKAXNnwscYzJysCHsoQ2S3dB9SCxt8r+QvDo09IkXzarFaW647nzG6H
zeCtbhx2NFU/jOqPip/8XDUaDYlDrjHuskDJwz+jzoaZfWPjLfPkmETU/8vh
mrGZH+kjYuu1WhmM8cGJZJLrKA7C2OPTAU5PRmx5enClHXhdxyskZ7BUxcXp
uPJJg7pemT/qaJPrO7e7wwhYw43GaeSULp8QGFsqireCwbv9mndB7bbOa40U
ElHmgWgcG1UkkydW/U9DaJHM52ZbrAuG7XkZRsmB1oTmVriEoOFYSiGv5F+R
Mjxe9OlqiL9Fd/AQXunAAMdwIU5T3mlkrxMvhroRkW2+EerrRVW3JbJ8gmQ9
lXPRw9RxcQY5m8S+8+CWikBHvsRBCXEGA8tXUYqLuDJKpRHeCo7PpONS3III
QB+tgWaMteoeJGZ7nGLFcaKxTGa1tNKju4M2845/L8Fawy8jdYYcLqOTUs80
M1gpQ0UHzTXdQEdQnufxgaCFfwblF5vIlr6qd89rR5m0eJipElQLi2Uh0Zd3
0t0i0xtFdprkxDmzX/bzbARAnlS1cz/yoB85r3JxeNPev671mocQc0uyFkt7
P04ogGWzLBN5B4nWNWDznOZS52G+vhkFxryUyl9+LDafAKiTTPmhB/LXPMs+
7ny7
=Xg0t
-----END PGP SIGNATURE-----


