Re: using bcache on BlueStore

On 12.10.2017 20:28, Jorge Pinilla López wrote:
> Hey all!
> I have a Ceph cluster with multiple HDDs and one really fast SSD (30 GB per OSD) per host.
> 
> I have been thinking about this: all the docs say I should give all the SSD space to RocksDB, so I would have the data on the HDD and a 30 GB partition for RocksDB.
> 
> But it occurred to me that if the OSD isn't full, I may not be using all the space on the SSD. And maybe I would prefer keeping a really small amount of hot k/v and metadata plus the data itself on a really fast device, rather than just storing all the cold metadata there.
> 
> So I thought about using bcache to turn the SSD into a cache; since metadata and k/v are usually hot, they should end up in the cache. But this doesn't guarantee that the k/v and metadata are actually always on the SSD, because under heavy cache load they can be pushed out (e.g. by really big data files).
> 
> So I came up with the idea of setting up a small 5-10 GB partition for the hot RocksDB data and using the rest as a cache. That way I make sure the really hot metadata is always on the SSD, and the colder data should also be on the SSD (via bcache) unless it is really freezing cold, in which case it gets pushed to the HDD. It also doesn't make any sense to have metadata you never use taking up space on the SSD; I would rather use that space to store hotter data.
> 
> This would also make writes faster, and in BlueStore we don't have the double-write problem, so it should work fine.
> 
> What do you think about this? Does it have any downside? Is there any other way?

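For concreteness, the split Jorge describes might look roughly like this; a
minimal sketch with hypothetical device names and a placeholder cache-set UUID,
not a tested recipe:

  # hypothetical layout: /dev/sdb = HDD, /dev/nvme0n1p1 = small 5-10 GB DB
  # partition, /dev/nvme0n1p2 = the rest of the SSD, used as the bcache cache
  make-bcache -B /dev/sdb          # register the HDD as the backing device
  make-bcache -C /dev/nvme0n1p2    # register the SSD partition as the cache
  echo <cset-uuid> > /sys/block/bcache0/bcache/attach   # attach the cache set

  # create the OSD on the cached device, with RocksDB on its own DB partition
  ceph-volume lvm create --bluestore --data /dev/bcache0 --block.db /dev/nvme0n1p1
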
Hi Jorge
  I was inexperienced and tried bcache on the old FileStore once. It was bad.
Mostly because bcache does not have any typical disk scheduling algorithm.
So when a scrub or rebalance was running, latency on such storage was very high
and unpredictable. The OSD daemon could not set any ioprio for disk reads or
writes, and additionally the bcache cache was poisoned by scrub/rebalance.
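For what it is worth, bcache does have tunables to keep big sequential IO (which
is most of what scrub and rebalance generate) out of the cache; a minimal
sketch, assuming a hypothetical bcache0 device and a placeholder cache-set UUID:

  # IO streams larger than this bypass the cache and go straight to the HDD
  echo 4M > /sys/block/bcache0/bcache/sequential_cutoff

  # bypass the cache when the SSD itself is congested (values in microseconds)
  echo 2000  > /sys/fs/bcache/<cset-uuid>/congested_read_threshold_us
  echo 20000 > /sys/fs/bcache/<cset-uuid>/congested_write_threshold_us

Whether that would have been enough for the latency problem, I cannot say.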

Fortunately for me, it is very easy to do a rolling replacement of OSDs.
I use some SSD partitions for journals now, and whatever is left for pure SSD storage.
This works really great.

If I ever need a cache, I will use cache tiering instead.
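A minimal cache-tiering sketch, assuming hypothetical pool names cold-pool
(HDD-backed) and hot-pool (SSD-backed):

  ceph osd tier add cold-pool hot-pool            # attach the cache pool
  ceph osd tier cache-mode hot-pool writeback     # cache both reads and writes
  ceph osd tier set-overlay cold-pool hot-pool    # route client IO through the cache
  ceph osd pool set hot-pool hit_set_type bloom   # hit tracking, required for writeback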


-- 
  Kind Regards
    Marek Grzybowski





_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com