Re: OSD Performance

Hi Kevin,

Writes are probably limited by a combination of locks, concurrent O_DSYNC journal writes, fsyncs, etc. The tests I mentioned were with both the OSD and the OSD journal on the same PCIe SSD. Others have looked into this in more detail than I have, so they may be able to chime in with more specifics.
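
For anyone curious what the separate-journal setup looks like, a minimal ceph.conf sketch; the device path below is only a placeholder, not something from my test box:

  [osd]
  # put the FileStore journal on a dedicated fast device/partition
  # (placeholder path -- substitute your own PCIe SSD / RAM-disk device)
  osd journal = /dev/nvme0n1p1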

Mark

On 02/24/2015 04:50 PM, Kevin Walker wrote:
Hi Mark

Thanks for the info. 22k is not bad, but it is still massively below what a PCIe SSD can achieve. Care to expand on why the write IOPS are so low? Was this with a separate PCIe RAM-disk device or an SLC SSD for the journal?

That fragmentation percentage looks good. We are considering using just SSDs for the OSDs and PCIe RAM-disk devices for the journals, so this would be OK.

Kind regards

Kevin Walker
+968 9765 1742

On 25 Feb 2015, at 02:35, Mark Nelson <mnelson@xxxxxxxxxx> wrote:

On 02/24/2015 04:21 PM, Kevin Walker wrote:
Hi All

I just recently joined the list and have been reading/learning about Ceph
for the past few months. Overall it looks to be well suited to our cloud
platform, but I have stumbled across a few worrying items that hopefully
you guys can clarify the status of.

Reading through various mailing list archives, it would seem an OSD caps
out at about 3k IOPS. Dieter Kasper from Fujitsu made an interesting
observation about the size of the OSD code (20k+ lines at that time).
Is this being optimized further, and has this IOPS limit been improved in
Giant?

In recent tests under fairly optimal conditions, I'm seeing performance topping out at about 4K object writes/s and 22K object reads/s against an OSD with a very fast PCIe SSD.  There are several reasons writes are slower than reads, but this is something we are working on improving in a variety of ways.

I believe others may have achieved even higher results.


Is there a way to overcome the XFS fragmentation problems other users
have experienced?

Setting the newish filestore_xfs_extsize parameter to true appeared to help in testing we did a couple of months ago.  We filled up a cluster to near capacity (~70%) and then did 12 hours of random writes.  After the test completed, we were seeing something like 13% fragmentation with filestore_xfs_extsize disabled, while with it enabled we were seeing around 0.02% fragmentation.
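
For anyone who wants to try the same comparison, a minimal sketch (the device path is just a placeholder for your OSD's data device):

  [osd]
  # hint XFS to preallocate larger extents on FileStore writes
  filestore_xfs_extsize = true

and fragmentation can be checked afterwards with something like:

  xfs_db -r -c frag /dev/sdb1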


Kind regards

Kevin



_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



