It's hard to comment on how your experience could be made better without more information about your configuration and how you're testing: which LSI controller model, PCI-E bus speed, number of expander cables, drive type, number of SSDs, and whether the SSDs were connected to the controller or directly to a SATA2/SATA3 port on the mainboard. You mentioned using an SSD journal but nothing about a writeback cache; did you try both? I'm also curious about what kind of workload didn't get better with an external journal. Was this with rados bench?
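If it was rados bench, something like the following pair of runs would help separate small-write from large-write behaviour (the pool name is just a placeholder):

    # small (4k) writes, 16 in flight, 60 seconds
    rados -p testpool bench 60 write -b 4096 -t 16
    # large writes (4M, the default object size) for comparison
    rados -p testpool bench 60 write -t 16

The small-write run is the one I'd expect an SSD journal to help the most.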
I'm really excited about tiering. It will disaggregate the SSDs and allow more flexibility in cephstore chassis selection, because you no longer have to maintain strict SSD:drive ratios. This seems like a much more elegant and maintainable solution.
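I haven't tried the tiering branch yet, so take this with a grain of salt, but from what has been discussed so far I'd expect the workflow to look roughly like this once it lands (pool names and pg counts are made up, and the commands may well change):

    ceph osd pool create cache-pool 128
    ceph osd tier add cold-pool cache-pool
    ceph osd tier cache-mode cache-pool writeback
    ceph osd tier set-overlay cold-pool cache-pool

with the cache pool mapped to SSD-only OSDs via a CRUSH rule.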
On Wed, Oct 9, 2013 at 3:45 PM, Warren Wang <warren@xxxxxxxxxxxxx> wrote:
While in theory this should be true, I'm not finding it to be the case for a typical enterprise LSI card with 24 drives attached. We tried a variety of ratios and went back to collocated journals on the spinning drives. Eagerly awaiting the tiered performance changes to implement a faster tier via SSD.
--
Warren

Journal on SSD should effectively double your throughput, because data will not be written to the same device twice to ensure transactional integrity. Additionally, by placing the OSD journal on an SSD you should see lower latency, since the disk head no longer has to seek back and forth between the journal and data partitions. For large writes it's not as critical to have a device that supports high IOPS or throughput, because large writes are striped across many 4MB RADOS objects that are distributed relatively evenly across the cluster. Small write operations benefit the most from an OSD data partition with a writeback cache like btier/flashcache, because it can absorb an order of magnitude more IOPS and allow the slower spinning device to catch up when there is less activity.
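For reference, the usual way to do this is either to symlink /var/lib/ceph/osd/ceph-<id>/journal to a partition on the SSD, or to point the OSD at it in ceph.conf. A minimal sketch, assuming one small partition per OSD (the partition label and OSD id are just placeholders):

    [osd]
        osd journal size = 10240   ; in MB

    [osd.12]
        ; journal on a dedicated SSD partition instead of the data disk
        osd journal = /dev/disk/by-partlabel/journal-osd-12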
On Tue, Oct 8, 2013 at 12:09 AM, Robert van Leeuwen <Robert.vanLeeuwen@xxxxxxxxxxxxx> wrote:
> I tried putting Flashcache on my spindle OSDs using an Intel SSD and it works great.
> This is getting me read and write SSD caching instead of just write performance on the journal.
> It should also allow me to protect the OSD journal on the same drive as the OSD data and still get benefits of SSD caching for writes.
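For anyone wanting to try the same thing, the flashcache side of the setup is roughly the following (device names are placeholders, and I'm assuming writeback mode):

    # SSD first, then the spinning disk; creates /dev/mapper/osd_cache
    flashcache_create -p back osd_cache /dev/sdb /dev/sdc
    # put the OSD filesystem on the cached device
    mkfs.xfs /dev/mapper/osd_cache
    mount /dev/mapper/osd_cache /var/lib/ceph/osd/ceph-12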
A small note on Red Hat based distros + Flashcache + XFS: there is a major issue (kernel panics) running XFS + Flashcache on a 6.4 kernel (anything higher than 2.6.32-279).
It should be fixed in kernel 2.6.32-387.el6, which I assume will be 6.5, which has only just entered beta.
For more info, take a look here:
https://github.com/facebook/flashcache/issues/113
Since I've hit this issue (thankfully in our dev environment), we are slightly less enthusiastic about running Flashcache :(
It also adds a layer of complexity, so I would rather just run the journals on SSD, at least on Red Hat.
I'm not sure about the performance difference of journals-only vs. Flashcache, but I'd be happy to read any such comparison :)
Also, if you want to make use of the SSD trim func
P.S. My experience with Flashcache is on Openstack Swift & Nova not Ceph.
--
Kyle

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com