Re: multiple journals on SSD

Hello,

Did you actually read my full reply from last week, the in-line parts and
not just the top bit?

http://www.spinics.net/lists/ceph-users/msg29266.html

On Tue, 12 Jul 2016 16:16:09 +0300 George Shuklin wrote:

> Yes, linear IO speed was a concern during the benchmark. I cannot predict how
> much linear IO clients will generate (compared to IOPS), so we are going to
> balance HDDs (OSDs) per SSD according to real usage. If users generate too
> much random IO, we will raise the HDD/SSD ratio; if they generate more
> linear-write load, we will reduce it. I plan to do this by reserving space
> for 'more HDDs' or 'more SSDs' in the planned servers - they will go to
> production with ~50% slot utilization.
> 
Journal writes are always "linear", in a fashion.
And Ceph journals only ever see writes, never reads.

So all your SSD sees is n sequential write streams (of varying lengths, mind
you) and nothing else, where n is the number of journals.
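
To put rough numbers on that, here is a minimal sketch in Python; the per-OSD
write rate and OSD count below are assumptions for illustration, not
measurements:

# Rough sketch of the aggregate sequential write load a journal SSD sees.
# Every client write is journaled once, so the SSD absorbs the sum of the
# per-OSD write rates as n sequential streams. Numbers are assumptions.

def journal_ssd_write_load_mbps(num_journals, per_osd_write_mbps):
    return num_journals * per_osd_write_mbps

# Example: 5 HDD-backed OSDs, each assumed to take ~80 MB/s of writes.
print(journal_ssd_write_load_mbps(5, 80))  # -> 400 MB/s hitting the SSD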

> My main concern is that random IO on an OSD includes not only writes but
> reads too, and cold random reads will slow HDD performance significantly.
> In my previous experience, any weekly cronjob on a server with backups (or
> just a 'find /') caused bad spikes of cold reads, and that drastically
> diminished HDD performance.
>
As I wrote last week, reads have nothing to do with journals.

To improve random, cold reads, your options are:

1. Enough RAM on the OSD storage nodes to hold all dentries and other SLAB
bits in memory; this will dramatically reduce seeks (see the rough sizing
sketch below).
2. Cache tiering, correctly configured and sized of course.
3. Read-ahead settings in RBD or in your client VMs.

Lastly, anything that keeps writes from competing with reads for the few HDD
IOPS there are: journal SSDs, controller HW caches and, again, cache
pools.
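
For point 1, a quick way to ballpark the RAM needed (a sketch only; the
per-entry slab sizes and object counts are assumptions, check slabtop on
your own nodes):

# Rough RAM estimate for keeping the dentry and inode slab entries of all
# objects on an OSD node in memory. Per-entry sizes are assumptions; check
# slabtop / /proc/slabinfo on your own hardware for real numbers.

def slab_cache_ram_gb(objects_per_osd, osds_per_node,
                      bytes_per_dentry=192, bytes_per_inode=1024):
    total_objects = objects_per_osd * osds_per_node
    return total_objects * (bytes_per_dentry + bytes_per_inode) / 1024**3

# Example: 1 million objects per OSD, 12 OSDs per node (both assumed).
print(round(slab_cache_ram_gb(1_000_000, 12), 1))  # -> ~13.6 GB of slab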
 
Christian

> (TL;DR: I don't believe we will ever have conditions where an HDD can
> sustain a few dozen MB/s of writes.)
> 
> Thank you for advice.
> 
> On 07/12/2016 04:03 PM, Vincent Godin wrote:
> > Hello.
> >
> > I've been testing an Intel 3500 as a journal store for a few HDD-based
> > OSDs. I stumbled on issues with multiple partitions (>4) and udev (sda5,
> > sda6, etc. sometimes do not appear after partition creation). And I'm
> > thinking that partitions are not that useful for OSD management, because
> > Linux does not allow re-reading the partition table while it contains
> > in-use volumes.
> >
> > So my question: how do you store many journals on an SSD? My initial thoughts:
> >
> > 1) a filesystem with file-based journals
> > 2) LVM with volumes
> >
> > Anything else? Best practice?
> >
> > P.S. I've done benchmarking: the 3500 can support up to 16 10k-RPM HDDs.
> >
> > Hello,
> >
> > I would advise you against using 1 SSD for 16 HDDs. The Ceph journal
> > is not only a journal but also a write cache during operation. I had that
> > kind of configuration with 1 SSD for 20 SATA HDDs. With a Ceph bench,
> > I noticed that my rate was limited to between 350 and 400 MB/s. Indeed,
> > iostat showed me that my SSD was 100% utilised at a rate of 350-400
> > MB/s.
> >
> > If you consider that a SATA HDD can have a maximum average rate of 100
> > MB/s, you need to configure one SSD (which can sustain up to 400 MB/s)
> > for 4 SATA HDDs.
> >
> > Vincent
> 
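
Turning that rule of thumb into arithmetic (a sketch; the bandwidth figures
are assumptions matching the example above, measure your own drives):

# How many HDDs a single journal SSD can serve before its sequential write
# bandwidth becomes the bottleneck. Figures are illustrative assumptions.

def max_hdds_per_journal_ssd(ssd_seq_write_mbps, hdd_seq_write_mbps):
    return ssd_seq_write_mbps // hdd_seq_write_mbps

# Example: an SSD sustaining ~400 MB/s vs SATA HDDs writing ~100 MB/s.
print(max_hdds_per_journal_ssd(400, 100))  # -> 4 HDDs per journal SSD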


-- 
Christian Balzer        Network/Systems Engineer                
chibi@xxxxxxx   	Global OnLine Japan/Rakuten Communications
http://www.gol.com/
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


