Re: Ceph on Solaris / Illumos


 



On 04/15/2015 10:36 AM, Jake Young wrote:


On Wednesday, April 15, 2015, Mark Nelson <mnelson@xxxxxxxxxx> wrote:



    On 04/15/2015 08:16 AM, Jake Young wrote:

        Has anyone compiled ceph (either osd or client) on a Solaris
        based OS?

        The thread on ZFS support for osd got me thinking about using
        Solaris as an osd server. It would have much better ZFS
        performance, and I wonder if the osd performance without a
        journal would be 2x better.


    Doubt it.  You may be able to do a little better, but you have to
    pay the piper somehow.  If you clone from the journal you will
    introduce fragmentation.  If you throw the journal away you'll
    suffer for everything but very large writes, unless you throw safety
    away.  I think if we are going to generally beat filestore (not just
    in optimal benchmarking tests!) it's going to take some very
    careful cleverness.  Thankfully Sage is very clever and is working on
    it in newstore.  Even there, filestore has been proving difficult to
    beat for writes.
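
For a rough sense of where the "2x" figure comes from: FileStore commits every write to the journal and then again to the filesystem, so a journal colocated on the data disk roughly halves sequential write bandwidth. A back-of-envelope sketch in Python; the 150 MB/s figure is an assumption for illustration, not a number measured in this thread:

# Rough model of the colocated-journal penalty; the disk bandwidth is an
# assumed value, not a measurement from this thread.
disk_bw_mb = 150.0                       # assumed raw sequential HDD bandwidth

no_journal_bw = disk_bw_mb               # hypothetical journal-less store
colocated_journal_bw = disk_bw_mb / 2    # every byte is written twice

print(f"no journal:        ~{no_journal_bw:.0f} MB/s")
print(f"colocated journal: ~{colocated_journal_bw:.0f} MB/s")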


That's interesting. I've been under the impression that the ideal
OSD config was a stable and fast BTRFS (which doesn't exist yet)
with no journal.

This is sort of unrelated to the journal specifically, but BTRFS with RBD will start fragmenting terribly due to how COW works (and how it interacts with snapshots too). More related to the journal: at one point we were thinking about cloning from the journal on BTRFS, but that also potentially leads to nasty fragmentation, even if the initial behavior looks very good. I haven't done any testing of BTRFS with no journal that I can remember. I'm not sure it even still works...
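
A quick, hypothetical way to watch that COW fragmentation happen on a BTRFS-backed file: do a pile of small random overwrites, then count extents with filefrag. The path and sizes below are made up, and it assumes a BTRFS mount and filefrag installed:

import os
import random
import subprocess

path = "/mnt/btrfs/rbd-like.img"   # hypothetical file on a BTRFS mount
size = 1 << 30                     # 1 GiB test file
block = 4096

with open(path, "wb") as f:
    f.truncate(size)               # start with a sparse "image"

with open(path, "r+b") as f:
    for _ in range(100_000):       # random 4k overwrites, RBD-style traffic
        f.seek(random.randrange(0, size, block))
        f.write(os.urandom(block))
    f.flush()
    os.fsync(f.fileno())

# On a COW filesystem the extent count climbs with every overwrite round.
subprocess.run(["filefrag", path], check=True)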


In my specific case, I don't want to use an external journal. I've gone
down the path of using RAID controllers with write-back cache and BBUs
with each disk in its own RAID0 group, instead of SSD journals. (Thanks
for your performance articles BTW, they were very helpful!)

My take on your results is that IO throughput on XFS with a same-disk
journal and WB cache on the RAID card was basically the same as or
better than BTRFS with no journal.  In addition, BTRFS typically used
much more CPU.

Has BTRFS performance gotten any better since you wrote the performance
articles?

So the trick with those articles is that the systems are fresh, and most of the initial articles used rados bench, which is always writing out new objects, versus something like RBD where you are (usually) doing writes to existing objects that represent the blocks. If you were to do a bunch of random 4k writes and then later try to do sequential reads, you'd see BTRFS sequential read performance tank. We actually did tests like that with Emperor during the Firefly development cycle; I've included the results. Basically the first iteration of the test cycle looks great on BTRFS, then you see read performance drop way down. Eventually write performance is also likely to drop as the disks become extremely fragmented (we may even see a little of that in those tests).
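
For anyone who wants to reproduce the shape of that test, here is a minimal sketch (not the actual Emperor test harness) using the python3-rados bindings: lay the objects down once, hit them with random 4k overwrites the way an RBD image would, then time a sequential read pass. The pool name, object counts, and sizes are assumptions:

import os
import random
import time
import rados   # python3-rados bindings

OBJ_SIZE = 4 << 20   # RBD's default object size
NUM_OBJS = 256       # ~1 GiB worth of "image"

cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
cluster.connect()
ioctx = cluster.open_ioctx("rbd")   # assumed pool name

# Phase 1: write out fresh objects (roughly what rados bench does all the time).
for i in range(NUM_OBJS):
    ioctx.write(f"block.{i}", os.urandom(OBJ_SIZE))

# Phase 2: random 4k overwrites into existing objects (RBD-like traffic).
for _ in range(20_000):
    name = f"block.{random.randrange(NUM_OBJS)}"
    off = random.randrange(0, OBJ_SIZE, 4096)
    ioctx.write(name, os.urandom(4096), off)

# Phase 3: sequential reads; on a fragmented COW backend this is where
# throughput falls off.
start = time.time()
for i in range(NUM_OBJS):
    ioctx.read(f"block.{i}", OBJ_SIZE)
print("seq read MB/s:", NUM_OBJS * OBJ_SIZE / (time.time() - start) / 1e6)

ioctx.close()
cluster.shutdown()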


Have you compared ZFS (ZoL) performance to BTRFS?

I did, way back in 2013, when we were working with Brian Behlendorf to fix xattr bugs in ZoL. It was quite a bit slower if you didn't enable SA xattrs. With SA xattrs it was much closer, but still not as fast as BTRFS or XFS. I didn't do a lot of tuning, though, and Ceph wasn't making good use of ZFS features, so it's very possible things have changed.
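
For context on why those xattrs hurt so much: FileStore keeps per-object metadata in extended attributes alongside each object file, so every object write also means a couple of setxattr calls. With ZoL's default xattr=dir each of those touches a hidden directory and file, while xattr=sa keeps them in the dnode. A small, hypothetical illustration of that access pattern (the path and attribute values are made up; Linux-only):

import os

obj_path = "/var/lib/ceph/osd/ceph-0/current/example_object"  # hypothetical path

with open(obj_path, "wb") as f:
    f.write(b"object payload")

# FileStore-style metadata updates riding along with the data write.
os.setxattr(obj_path, b"user.ceph._", b"encoded object_info_t (illustrative)")
os.setxattr(obj_path, b"user.ceph.snapset", b"encoded snapset (illustrative)")

print(os.listxattr(obj_path))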



Attachment: Emperor Raw Performance Data.ods
Description: application/vnd.oasis.opendocument.spreadsheet

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


