Hi Philip,
I'm not sure if we're talking about the same thing, but I was also
confused when I didn't see 100% OSD drive utilization during my first
RBD write benchmark. Since then I've been collecting all my confusion here:
https://yourcmc.ru/wiki/Ceph_performance :)
100% RBD utilization means that, at all times, something is waiting for
I/O operations on that device to complete.
This "something" (the client software) can't issue more I/O operations
while it's waiting for the previous ones to complete; that's why it can't
saturate your OSDs or your network.
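To make this concrete, here is a toy Python calculation. All the latency
numbers are assumptions I made up for illustration, not measurements from
your cluster:

# Toy model: with queue depth 1, "%util" only tells you that at least one
# request was in flight during the interval, not how much work was done.
per_op_total_s = 0.0010   # assumed end-to-end RBD write latency (network + OSD CPU + disk)
drive_write_s  = 0.0001   # assumed part of that actually spent writing on one OSD drive
interval_s     = 1.0      # iostat-style sampling interval

ops = interval_s / per_op_total_s                 # one op at a time -> ~1000 writes/s
rbd_busy   = ops * per_op_total_s / interval_s    # 1.0 -> the rbd device shows 100% util
drive_busy = ops * drive_write_s / interval_s     # 0.1 -> the OSD drive shows ~10% util

print(f"writes/s: {ops:.0f}, rbd %util: {rbd_busy:.0%}, OSD drive %util: {drive_busy:.0%}")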
OSDs can't send more write requests to the drives while they're still
busy calculating object state on the CPU or doing network I/O. That's
why the OSDs can't saturate the drives.
Simply put: Ceph is slow. Partly because of the network round trips (you
have 3 of them: client -> iscsi -> primary OSD -> secondary OSDs), and
partly because of its own per-request CPU overhead in the OSDs.
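The consequence is Little's law: IOPS = queue depth / per-op latency, so
with queue depth 1 the round-trip latency alone caps your IOPS no matter
how fast the drives are. A rough sketch (the per-hop numbers below are
assumptions, replace them with values measured on your own cluster):

def max_iops(queue_depth: int, per_op_latency_s: float) -> float:
    # Little's law: concurrency = throughput * latency
    return queue_depth / per_op_latency_s

network_s = 3 * 0.0002   # assumed ~0.2 ms per hop: client -> iscsi -> primary -> secondaries
osd_cpu_s = 0.0003       # assumed OSD bookkeeping/checksum time
drive_s   = 0.0001       # assumed actual drive write time
per_op    = network_s + osd_cpu_s + drive_s   # = 1 ms total

print(f"per-op latency: {per_op * 1000:.1f} ms")
print(f"QD=1  -> {max_iops(1, per_op):>6.0f} IOPS")
print(f"QD=32 -> {max_iops(32, per_op):>6.0f} IOPS")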
Of course it's not TERRIBLY slow, so software that can send I/O requests
in batches (i.e. uses async I/O) feels fine. But software that sends I/Os
one by one (because of transactional requirements, or just stupidity like
Oracle) runs very slowly.
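Here is a minimal Python sketch of that difference; the 1 ms sleep is an
assumed stand-in for one replicated RBD write, not a real benchmark:

import time
from concurrent.futures import ThreadPoolExecutor

PER_OP_LATENCY = 0.001   # assumed ~1 ms per write round trip
N_OPS = 256

def fake_write(_):
    time.sleep(PER_OP_LATENCY)   # stands in for one synchronous RBD write

# One-by-one submission (transactional software): total ~ N_OPS * latency.
t0 = time.time()
for i in range(N_OPS):
    fake_write(i)
print(f"one by one  : {time.time() - t0:.2f} s")

# 32 requests in flight (what async/batched I/O gives you): total ~ N_OPS * latency / 32.
t0 = time.time()
with ThreadPoolExecutor(max_workers=32) as pool:
    list(pool.map(fake_write, range(N_OPS)))
print(f"32 in flight: {time.time() - t0:.2f} s")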
Also...
"It seems like your RBD can't flush its I/O fast enough"
implies that there is some particular measure of "fast enough", i.e. a
tunable value somewhere.
If my network cards aren't blocked, and my OSDs aren't blocked...
then doesn't that mean that I can and should "turn that knob" up?