> On 02.10.2018 19:28, jesper@xxxxxxxx wrote:
> In the cephfs world there is no central server that holds the cache;
> each cephfs client reads data directly from the OSDs.

I can accept this argument, but nevertheless: if I used Filestore, it
would work.

> This also means no single point of failure, and you can scale out
> performance by spreading metadata tree information over multiple MDS
> servers, and scale out storage and throughput with added OSD nodes.
>
> So if the cephfs client cache is not sufficient, you can look at the
> bluestore cache.
> http://docs.ceph.com/docs/mimic/rados/configuration/bluestore-config-ref/#cache-size

I have been there, but it seems to "not work" - I think having to slice
the cache per OSD and statically allocate memory to each OSD breaks the
efficiency (but I cannot prove it). I've sketched my reading of it in
the PS below.

> Or you can look at adding an ssd layer over the spinning disks, with
> e.g. bcache. I assume you are using an ssd/nvram for the bluestore db
> already.

My current bluestore OSDs are backed by 10TB 7.2K RPM drives, although
behind BBWC. Can you elaborate on the "assumption"? We're not doing
that, and I'd like to explore it.

> You should also look at tuning the cephfs metadata servers.
> Make sure the metadata pool is on fast ssd OSDs, and tune the mds
> cache to the mds server's ram, so you cache as much metadata as
> possible.

Yes, we're in the process of doing that. I believe we're seeing the MDS
suffer when we saturate a few disks in the setup, since they are shared;
thus we'll move the metadata to SSD as per the recommendations (rough
plan in the PS).

--
Jesper
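
PS. For reference, my reading of the mimic cache-size docs is that the
bluestore cache is carved out per OSD and sized statically, roughly
like this in ceph.conf (the 4 GiB figure is only an illustration, not
what we run):

    [osd]
    # Per-OSD, statically allocated cache for HDD-backed bluestore OSDs
    # (the default is 1 GiB).
    bluestore cache size hdd = 4294967296
    # How that cache is split between rocksdb (kv) and onode metadata;
    # whatever is left over is used for data.
    bluestore cache kv ratio = 0.4
    bluestore cache meta ratio = 0.4

So with many OSDs per host there is no shared pool of cache the way the
kernel page cache works under Filestore, which is what I meant by it
breaking the efficiency.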
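
If the ssd/nvram-for-db suggestion means what I think it does, it is
about putting the rocksdb part of each bluestore OSD on flash at
creation time, something like the following (device names are made up
for the example):

    # Create a bluestore OSD with the data on the spinner and the
    # rocksdb metadata (block.db) on an NVMe partition; the WAL lives
    # with the db unless given its own device.
    ceph-volume lvm create --bluestore --data /dev/sdb --block.db /dev/nvme0n1p1

That would have to be done when (re)creating the OSDs, which is
presumably why you ask whether we have it already.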
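
For the metadata-on-ssd part, my rough plan is along these lines,
assuming the flash OSDs report the 'ssd' device class and that the
metadata pool is named cephfs_metadata (both are assumptions about our
setup):

    # Replicated CRUSH rule that only selects ssd-class OSDs,
    # with host as the failure domain.
    ceph osd crush rule create-replicated replicated-ssd default host ssd

    # Point the cephfs metadata pool at that rule; its PGs will be
    # remapped onto the ssd OSDs.
    ceph osd pool set cephfs_metadata crush_rule replicated-ssd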
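
And for the MDS cache itself, just raising the memory limit towards the
RAM in the MDS servers, e.g. (16 GiB is only an example value):

    [mds]
    # Let the MDS keep roughly 16 GiB of cached metadata
    # (the default is 1 GiB).  It is a soft limit, so leave some
    # headroom relative to the machine's RAM.
    mds cache memory limit = 17179869184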