Re: Bluestore vs. Filestore

On Tue, 2 Oct 2018, jesper@xxxxxxxx wrote:
> Hi.
> 
> Based on some recommendations we have set up our CephFS installation using
> bluestore*. We're trying to get a strong replacement for a "huge" xfs+NFS
> server - 100TB-ish size.
> 
> Current setup is a sizeable Linux host with 512GB of memory, one large
> Dell MD1200 or MD1220 (100TB+), and a Linux kernel NFS server.
> 
> Since our "hot" dataset is < 400GB we can actually serve the hot data
> directly out of the host page-cache and never really touch the "slow"
> underlying drives. Except when new bulk data is written, where a PERC with
> BBWC absorbs the writes.
> 
> In the CephFS + Bluestore world, Ceph is "deliberately" bypassing the host
> OS page-cache, so even when we have 4-5 x 256GB memory** in the OSD hosts
> it is really hard to create a synthetic test where the hot data does not
> end up being read out of the underlying disks. Yes, the
> client side page cache works very well, but in our scenario we have 30+
> hosts pulling the same data over NFS.
> 
> Is bluestore just a "bad fit", and would Filestore "do the right thing"? Is
> the recommendation to make an SSD "overlay" on the slow drives?
> 
> Thoughts?

1. This sounds like it is primarily a matter of configuring the bluestore 
cache size.  This is the main downside of bluestore: it doesn't magically 
use any available RAM as a cache (like the OS page cache).
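
The relevant knob is bluestore_cache_size (or the bluestore_cache_size_hdd /
bluestore_cache_size_ssd variants).  Roughly, something like this in
ceph.conf on the OSD hosts; the 8 GiB figure is only an illustration, size
it to your actual RAM budget and OSD count:

 [osd]
 # per-OSD bluestore cache (bytes); defaults are 1 GiB for HDD, 3 GiB for SSD
 bluestore_cache_size_hdd = 8589934592
 bluestore_cache_size_ssd = 8589934592

Remember the cache is per OSD daemon, so multiply by the number of OSDs per
host when budgeting memory.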

2. There are two other important options that control bluestore cache 
behavior:

 bluestore_default_buffered_read (default true)
 bluestore_default_buffered_write (default false)

Given your description it sounds like the defaults are fine: newly written 
data won't land in cache, but once it is read it will be there.  If you 
want recent writes to land in cache, you can change the second option to 
true.
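
Something like this in ceph.conf on the OSD hosts would do it:

 [osd]
 bluestore_default_buffered_write = true

or, to try it at runtime (some bluestore options may only fully take effect 
after an OSD restart):

 ceph tell osd.* injectargs '--bluestore_default_buffered_write=true'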

3. Because we don't use the page cache, an OSD restart also drops the 
cache, so be sure to allow things to warm up after a restart before 
drawing conclusions about steady-state performance.
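
A quick way to sanity-check whether the cache has warmed up is to look at 
the bluestore mempools via the OSD admin socket, e.g. (osd.0 is just an 
example id, and the exact mempool names vary a bit between releases):

 ceph daemon osd.0 dump_mempools | grep -A 2 bluestore_cache

The bytes reported for the onode and data pools should climb toward your 
configured cache size as the hot set gets read.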

Hope that helps!
sage
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


