Re: Persistent Write Back Cache

Hi Christian,

Yes that's correct, it's on the client side. I don't see this as much
different from a battery-backed RAID controller: if you lose power, the data
stays in the cache until power resumes, at which point it is flushed.

If you are going to have the same RBD accessed by multiple servers/clients
then you need to make sure the SSD is accessible to both (e.g. DRBD or
dual-port SAS). Something like Pacemaker would then be responsible for
ensuring the RBD and cache device are both present before allowing client
access.

When I wrote this I was thinking more about two HA iSCSI servers with RBDs.
However, I can understand that this feature would prove more of a challenge
if you are using QEMU and RBD.

Nick

-----Original Message-----
From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of
Christian Balzer
Sent: 04 March 2015 08:40
To: ceph-users@xxxxxxxxxxxxxx
Cc: Nick Fisk
Subject: Re:  Persistent Write Back Cache


Hello,

If I understand you correctly, you're talking about the rbd cache on the
client side.

So assume that the host, or the cache SSD on it, fails terminally.
The client thinks its synced writes are on the permanent storage (the actual
Ceph storage cluster), while they are only present locally.

So restarting that service or VM on a different host now has to deal with
likely crippling data corruption.
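
A toy model of that failure, purely for illustration: the names
(BackingStore, LocalWriteBackCache, lose_cache) are invented here and do not
correspond to any librbd or flashcache API. It only shows why an acknowledged
sync that lives solely in a local cache is lost when the cache is lost.

```python
class BackingStore:
    """Stands in for the Ceph cluster: the only copy another host can see."""

    def __init__(self):
        self.blocks = {}


class LocalWriteBackCache:
    """Hypothetical client-side cache that acknowledges syncs before flushing."""

    def __init__(self, store):
        self.store = store
        self.dirty = {}

    def write(self, block, data):
        # Writes land in the local cache only.
        self.dirty[block] = data

    def sync(self):
        # Acknowledged at once; the data is still only on the cache device.
        return True

    def flush(self):
        # Dirty blocks finally reach the backing store.
        self.store.blocks.update(self.dirty)
        self.dirty.clear()

    def lose_cache(self):
        # Host or cache SSD fails terminally: unflushed blocks are gone.
        lost = sorted(self.dirty)
        self.dirty.clear()
        return lost


store = BackingStore()
cache = LocalWriteBackCache(store)

cache.write(0, "journal v1")
cache.write(1, "data v1")
cache.sync()
cache.flush()                 # this generation reaches the cluster

cache.write(0, "journal v2")
acked = cache.sync()          # client now believes "journal v2" is durable
lost = cache.lose_cache()     # failure before the next flush

print("sync acknowledged:", acked)
print("blocks lost:", lost)
print("what a second host sees:", store.blocks)
```

A service restarted elsewhere sees only what was flushed, even though the
client was told the later write had been synced.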

Regards,

Christian

On Wed, 4 Mar 2015 08:26:52 -0000 Nick Fisk wrote:

> Hi All,
>
> Is there anything in the pipeline to add the ability to write the 
> librbd cache to SSD so that it can safely ignore sync requests? I saw 
> a thread from a few years back where Sage was discussing something 
> similar, but I can't find anything more recent discussing it.
>
> I've been running lots of tests on our new cluster; buffered/parallel 
> performance is amazing (40K read, 10K write IOPS), very impressed. 
> However, sync writes are actually quite disappointing.
>
> Running fio with a 128k block size and iodepth=1 normally only gives me 
> about 300 IOPS or 30MB/s. I'm seeing 2-3ms latency writing to SSD OSDs 
> and from what I hear that's about normal, so I don't think I have a 
> Ceph config problem. For applications which do a lot of syncs, like 
> ESXi over iSCSI or SQL databases, this has a major performance impact.
>
> Traditional storage arrays work around this problem by having a 
> battery-backed cache with latency 10-100 times lower than what you 
> can currently achieve with Ceph and an SSD. Whilst librbd does have a 
> writeback cache, from what I understand it will not cache syncs, so 
> in my use case it effectively acts like a write-through cache.
>
> To illustrate the difference a proper write-back cache can make, I put 
> a 1GB (512MB dirty threshold) flashcache in front of my RBD and 
> tweaked the flush parameters to flush dirty blocks at a large queue 
> depth. The same fio test (128k, iodepth=1) now runs at 120MB/s and is 
> limited by the performance of the SSD used by flashcache, as everything 
> is stored as 4k blocks on the SSD. In fact, since everything is stored 
> as 4k blocks, pretty much all IO sizes are accelerated to the max speed 
> of the SSD. Looking at iostat I can see all the IOs are getting 
> coalesced into nice large 512KB IOs at a high queue depth, which Ceph 
> easily swallows.
>
> If librbd could support writing its cache out to SSD it would 
> hopefully achieve the same level of performance and having it 
> integrated would be really neat.
>
> Nick
> 
> 
> 
> 
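
The numbers quoted above hang together if you treat a depth-1 sync write as
fully serialized, so that IOPS is simply the inverse of per-write latency. A
rough sketch of that arithmetic follows. The function names are made up for
illustration, and "MB/s" is treated as MiB/s, which is an assumption; the
original figures are approximate either way.

```python
BLOCK_KIB = 128  # fio block size used in the tests quoted above


def qd1_iops(latency_ms):
    """IOPS at queue depth 1: one write must complete before the next starts."""
    return 1000.0 / latency_ms


def throughput_mib_s(iops, block_kib=BLOCK_KIB):
    """Throughput in MiB/s for a given IOPS rate and block size."""
    return iops * block_kib / 1024.0


def implied_latency_ms(mib_s, block_kib=BLOCK_KIB):
    """Per-write latency implied by a depth-1 throughput figure."""
    iops = mib_s * 1024.0 / block_kib
    return 1000.0 / iops


# Ceiling set by the reported 2-3ms write latency to the SSD OSDs.
for latency in (2.0, 3.0):
    iops = qd1_iops(latency)
    print(f"{latency:.0f} ms -> {iops:.0f} IOPS, {throughput_mib_s(iops):.1f} MiB/s")

# Latency implied by the reported ~300 IOPS without the local cache.
print(f"300 IOPS -> {1000.0 / 300:.2f} ms per write")

# Latency implied by the reported 120MB/s with flashcache in front.
print(f"120 MiB/s -> {implied_latency_ms(120):.2f} ms per write")
```

On this simplified model the only way to raise depth-1 sync throughput is to
cut per-write latency, which is what acknowledging from a local SSD does.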


-- 
Christian Balzer        Network/Systems Engineer                
chibi@xxxxxxx   	Global OnLine Japan/Fusion Communications
http://www.gol.com/
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



