Re: Latency impact on RBD performance

This simply depends on what your workload is. I know that's a non-answer for you, but that's how it is.

Databases are the worst case, because they tend to hit the disks on every transaction, so transaction throughput is directly proportional to the number of IOPS you can get. And the number of IOPS you can get (in this scenario) is basically iops = 1000 / latency_in_ms, because there's not much parallelism when committing.

So if your storage has 2 ms latency now, it can achieve at most 500 IOPS (single thread, queue depth 1, synchronous) - which in theory equals 500 durable transactions per second in a database.
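As a rough illustration of that ceiling (a sketch with my own numbers - the 12 ms case below assumes 2 ms of disk latency plus the ~10 ms link from your question, it is not a measurement):

def max_sync_iops(latency_ms: float) -> float:
    """Upper bound on IOPS at queue depth 1: one I/O per round trip."""
    return 1000.0 / latency_ms

for lat in (0.5, 2.0, 10.0, 12.0):   # 12 ms ~= 2 ms disk + 10 ms WAN link
    print(f"{lat:5.1f} ms latency -> at most {max_sync_iops(lat):7.1f} IOPS "
          f"(~ durable transactions/s, single queue)")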

But in practice:
a) mysql/innodb has an option to flush not on every transaction but on every Xth transaction. With X=10 it's more or less like having a 5000 IOPS disk (see the rough sketch after this list).
b) there is filesystem overhead - if you are appending to the transaction log you have to flush not only the data but also the filesystem metadata, and that can cost many more IOPS before you even get to the transaction itself - I've seen a factor of 10(!) with no preallocation on xfs. That's terrible. But not every database does this; mysql, for example, creates and preallocates the ib_logfileX files, so that should not be an issue if you run mysql.
c) other I/Os can block the submission - you have a limited queue depth for in-flight I/Os, so the queue can clog up even if you turn the synchronous I/O into asynchronous, and you often have to flush even the asynchronous writes
d) we must not forget about reads - if there's a webserver connecting to the database, it will need more memory because all requests will take longer (and they often consume CPU even while "waiting"), and if there are multiple requests or subrequests the effect cascades and adds up fast.
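To put points a) and b) together, here is a back-of-the-envelope sketch; the group-commit and metadata-overhead factors are illustrative assumptions, not measurements:

def est_tx_per_sec(latency_ms: float,
                   group_commit: int = 1,     # flush every Nth transaction, point (a)
                   ios_per_commit: float = 1.0  # extra I/Os per commit for fs metadata, point (b)
                   ) -> float:
    """Estimate durable transactions/s at queue depth 1."""
    iops = 1000.0 / latency_ms              # single-queue IOPS ceiling
    flushes_per_sec = iops / ios_per_commit
    return flushes_per_sec * group_commit

print(est_tx_per_sec(12.0))                      # baseline over the link: ~83 tx/s
print(est_tx_per_sec(12.0, group_commit=10))     # group commit, X=10: ~833 tx/s
print(est_tx_per_sec(12.0, ios_per_commit=10))   # no preallocation, 10x metadata I/O: ~8 tx/s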
-------

Take a look in iostat at the drive utilization and latency of your database's disk. Then calculate:

resulting_disk_utilization = (latency + 10) / latency * utilization

Taking the figures from the previous example, let's say you have 2 ms latency and the drive is 20% utilized: (2+10)/2*20 = 120%.
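The same back-of-the-envelope formula in code form (the 10 ms is the added latency of the proposed link; the example numbers match the ones above):

def projected_utilization(current_latency_ms: float,
                          current_util_pct: float,
                          added_latency_ms: float = 10.0) -> float:
    """Scale current utilization by how much longer each I/O will take."""
    factor = (current_latency_ms + added_latency_ms) / current_latency_ms
    return factor * current_util_pct

print(projected_utilization(2.0, 20.0))   # -> 120.0 (%), i.e. already past saturation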

My guess is that it's going to be horrible unless you turn everything into async IO, enable cache=unsafe in qemu and pray really hard :/

Jan


> On 19 Aug 2015, at 16:20, Logan Barfield <lbarfield@xxxxxxxxxxxxx> wrote:
> 
> Hi,
> 
> We are currently using 2 OSD hosts with SSDs to provide RBD backed volumes for KVM hypervisors.  This 'cluster' is currently set up in 'Location A'.
> 
> We are looking to move our hypervisors/VMs over to a new location, and will have a 1Gbit link between the two datacenters.  We can run Layer 2 over the link, and it should have ~10ms of latency.  Call the new datacenter 'Location B'.
> 
> One proposed solution for the migration is to set up new RBD hosts in the new location, set up a new pool, and move the VM volumes to it.
> 
> The potential issue with this solution is that we can end up in a scenario where the VM is running on a hypervisor in 'Location A', but writing/reading to a volume in 'Location B'.
> 
> My question is: what kind of performance impact should we expect when reading/writing over a link with ~10ms of latency?  Will it bring I/O intensive operations (like databases) to a halt, or will it be 'tolerable' for a short period (a few days)?  Most of the VMs are running database-backed e-commerce sites.
> 
> My expectation is that 10ms for every I/O operation will cause a significant impact, but we wanted to verify that before ruling it out as a solution.  We will also be doing some internal testing of course.
> 
> 
> I appreciate any feedback the community has.
> 
> - Logan 

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


