> On 02.10.2018 19:28, jesper@xxxxxxxx wrote:
> In the cephfs world there is no central server that holds the cache;
> each cephfs client reads data directly from the OSDs.

I can accept this argument, but nevertheless: if I used Filestore, it
would work.

> This also means no single point of failure, and you can scale out
> performance by spreading metadata tree information over multiple MDS
> servers, and scale out storage and throughput with added OSD nodes.
>
> So if the cephfs client cache is not sufficient, you can look at the
> bluestore cache.
> http://docs.ceph.com/docs/mimic/rados/configuration/bluestore-config-ref/#cache-size

I have been there, but it seems to "not work" - I think having to slice
the cache per OSD and statically allocate memory to each OSD breaks the
efficiency (but I cannot prove it). I've sketched my reading of it in
the PS below.

> Or you can look at adding an ssd layer over the spinning disks, with
> e.g. bcache. I assume you are using an ssd/nvram for the bluestore db
> already.

My current bluestore OSDs are backed by 10TB 7.2K RPM drives, although
behind BBWC. Can you elaborate on the "assumption"? We're not doing
that, and I'd like to explore it.

> You should also look at tuning the cephfs metadata servers.
> Make sure the metadata pool is on fast ssd OSDs, and tune the mds
> cache to the mds server's ram, so you cache as much metadata as
> possible.

Yes, we're in the process of doing that. I believe we're seeing the MDS
suffer when we saturate a few disks in the setup, since they are shared;
thus we'll move the metadata to SSD as per the recommendations (rough
plan in the PS).

--
Jesper
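
PS. For reference, my reading of the mimic cache-size docs is that the
bluestore cache is carved out per OSD and sized statically, roughly
like this in ceph.conf (the 4 GiB figure is only an illustration, not
what we run):

    [osd]
    # Per-OSD, statically allocated cache for HDD-backed bluestore OSDs
    # (the default is 1 GiB).
    bluestore cache size hdd = 4294967296
    # How that cache is split between rocksdb (kv) and onode metadata;
    # whatever is left over is used for data.
    bluestore cache kv ratio = 0.4
    bluestore cache meta ratio = 0.4

So with many OSDs per host there is no shared pool of cache the way the
kernel page cache works under Filestore, which is what I meant by it
breaking the efficiency.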
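
If the ssd/nvram-for-db suggestion means what I think it does, it is
about putting the rocksdb part of each bluestore OSD on flash at
creation time, something like the following (device names are made up
for the example):

    # Create a bluestore OSD with the data on the spinner and the
    # rocksdb metadata (block.db) on an NVMe partition; the WAL lives
    # with the db unless given its own device.
    ceph-volume lvm create --bluestore --data /dev/sdb --block.db /dev/nvme0n1p1

That would have to be done when (re)creating the OSDs, which is
presumably why you ask whether we have it already.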
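
For the metadata-on-ssd part, my rough plan is along these lines,
assuming the flash OSDs report the 'ssd' device class and that the
metadata pool is named cephfs_metadata (both are assumptions about our
setup):

    # Replicated CRUSH rule that only selects ssd-class OSDs,
    # with host as the failure domain.
    ceph osd crush rule create-replicated replicated-ssd default host ssd

    # Point the cephfs metadata pool at that rule; its PGs will be
    # remapped onto the ssd OSDs.
    ceph osd pool set cephfs_metadata crush_rule replicated-ssd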
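
And for the MDS cache itself, just raising the memory limit towards the
RAM in the MDS servers, e.g. (16 GiB is only an example value):

    [mds]
    # Let the MDS keep roughly 16 GiB of cached metadata
    # (the default is 1 GiB).  It is a soft limit, so leave some
    # headroom relative to the machine's RAM.
    mds cache memory limit = 17179869184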