Hi!
> My deployments have seen many different versions of ceph. Pre 0.80.7, I've
> seen those numbers being pretty high. After upgrading to 0.80.7, all of a
> sudden, commit latency of all OSDs drop to 0-1ms, and apply latency remains
> pretty low most of the time.
We now run Ceph 0.80.7-1~bpo70+1 on Debian Wheezy with a 3.16.4 kernel backported
from Jessie. I can't see a commit latency in the perf dumps, only "commitcycle_latency".
Is that the perf counter you are discussing here? Our values look too high: on the nodes
with RAID0-per-disk it is between 20 and 120 ms, and on the nodes with straight HBA
passthrough it is worse, 200-600 ms.
Apply latency, however, is between 3 and 19 ms (avg = 7.2 ms), and journal latencies are
also good, 0.49-1.84 ms.
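For reference, this is roughly how per-OSD averages like the ones above can be derived from a perf dump. It is only a sketch: it assumes the JSON printed by "ceph daemon osd.N perf dump" has a "filestore" section whose latency counters are avgcount/sum pairs with the sum in seconds, and the sample numbers are made up for illustration, not taken from our cluster.

```python
import json


def avg_latency_ms(counter):
    """Average latency in ms for a counter shaped like {"avgcount": n, "sum": seconds}."""
    count = counter.get("avgcount", 0)
    if count == 0:
        return None  # no samples recorded yet
    return 1000.0 * counter["sum"] / count


def filestore_latencies(perf_dump_json):
    """Pick the filestore latency counters out of a perf dump (assumed layout)."""
    filestore = json.loads(perf_dump_json).get("filestore", {})
    names = ("journal_latency", "apply_latency", "commitcycle_latency")
    return {name: avg_latency_ms(filestore[name]) for name in names if name in filestore}


# Made-up sample, only to show the arithmetic.
sample_dump = json.dumps({
    "filestore": {
        "journal_latency": {"avgcount": 1000, "sum": 1.2},
        "apply_latency": {"avgcount": 1000, "sum": 7.2},
        "commitcycle_latency": {"avgcount": 50, "sum": 6.0},
    }
})

for name, value in filestore_latencies(sample_dump).items():
    print(name, value)
```

Note that these averages are cumulative since the OSD started, so two dumps taken some time apart have to be subtracted to get a recent value.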
All OSDs are 1 TB or 2 TB WD Black SATA disks with the cache disabled and the "deadline"
I/O scheduler. The backing store is XFS, mounted with noatime and inode64, fragmentation < 12%.
Journals are on DC S3700 200 GB SSDs, 12 OSDs per SSD (I know the recommendation is
4-6 per SSD). The client and cluster networks are combined and run on a pair of 10 Gbit
HP5920 switches.
What do you think about my perf values? I think I am missing something in the config,
but I can't work out what exactly.
I also have another question: how does bridging/bonding influence operation
latencies? We use a native Linux bond in failover mode between the 10G (main) and
1G (failover) interfaces. By default the 10 Gbit interface is chosen with higher priority,
so the cluster operates at 10 Gbit speed.
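To rule out the bond silently running on the 1G link, the active slave can be read from the bonding driver's status file. The sketch below assumes the usual /proc/net/bonding/<bond> text layout; the interface name eth2 and the sample text are hypothetical, not copied from our nodes.

```python
def bond_status(text):
    """Extract mode, primary and active slave from bonding status text."""
    wanted = {
        "Bonding Mode": "mode",
        "Primary Slave": "primary",
        "Currently Active Slave": "active",
    }
    status = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        field = wanted.get(key.strip())
        # Keep only the first occurrence of each field (the bond-level header).
        if sep and field and field not in status:
            status[field] = value.strip()
    return status


# Hypothetical sample in the format of /proc/net/bonding/bond0.
sample_bond = """Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: eth2 (primary_reselect always)
Currently Active Slave: eth2
MII Status: up

Slave Interface: eth2
MII Status: up
Speed: 10000 Mbps
"""

print(bond_status(sample_bond))
```

On a real node the text would come from open("/proc/net/bonding/bond0").read(), with bond0 replaced by the actual bond name.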
Megov Igor
CIO, Yuterra
_______________________________________________ ceph-users mailing list ceph-users@xxxxxxxxxxxxxx http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com