
Hello,

On Mon, 3 Jul 2017 13:01:06 +0200 Mateusz Skała wrote:

> Hello,
>
> We are using cache-tier in Read-forward mode (replica 3) for accelerate
> reads and journals on SSD to accelerate writes.

OK, lots of things wrong with this statement, but firstly, the Ceph version
(it is relevant) and more details about your setup and the SSDs used would
be interesting and helpful.

If you had searched the ML archives for readforward you'd have come across
a very recent thread by me, in which the powers that be state that this
mode is dangerous and not recommended.
During quite some testing with this mode I never encountered any problems,
but consider yourself warned.

Now readforward will FORWARD reads to the backing storage, so it will NEVER
accelerate reads (it does not promote them to the cache-tier).
The only speedup you will see is for objects that have been previously
written and are still in the cache-tier.

Using cache-tiers can work beautifully if you understand the I/O patterns
involved (tricky on a cloud storage with very mixed clients) and can make
your cache-tier large enough to cover the hot objects (working set), or at
least (as you are attempting) segregate the read and write paths as much as
possible.

> We are using only RBD. Based
> on the ceph-docs, RBD have bad I/O pattern for cache tier. I'm looking for
> information about other possibility to accelerate reads on RBD with SSD
> drives.
>
The documentation rightly warns about things, so people don't have
unrealistic expectations.
However YOU need to look at YOUR loads, patterns and usage and then decide
if it is beneficial or not.

As I hinted above, analyze your systems: are the reads actually slow, or are
they slowed down by competing writes to the same storage?

Cold reads (OSD server just rebooted, no cache has that object in it) will
obviously not benefit from any scheme.

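
To make the readforward point above concrete, here is a toy model in
Python. It is only a sketch under loud assumptions: a cache-tier is reduced
to a set of object names, nothing is ever evicted, and the "writeback" case
promotes on the very first read miss, which real promotion settings would
not necessarily do. The class ToyTier, the function replay and the workload
are made up for illustration and are not part of Ceph.

```python
# Toy model only: NOT how Ceph implements tiering. It just mirrors the
# behaviour described in this mail for the two cache modes.

class ToyTier:
    def __init__(self, mode):
        if mode not in ("readforward", "writeback"):
            raise ValueError("unknown mode: " + mode)
        self.mode = mode
        self.cached = set()   # objects currently held on the SSD tier
        self.hits = 0         # reads served from the SSD tier
        self.misses = 0       # reads served from the backing (HDD) pool

    def write(self, obj):
        # In both modes writes land in the cache-tier first.
        self.cached.add(obj)

    def read(self, obj):
        if obj in self.cached:
            self.hits += 1
            return "cache"
        self.misses += 1
        if self.mode == "writeback":
            # Assumption: promote on the first miss. Real promotion
            # depends on the configured promotion settings.
            self.cached.add(obj)
        # readforward: the read is forwarded, nothing is promoted.
        return "backing"


def replay(mode, ops):
    """Run a list of ("w" | "r", object) operations, return (hits, misses)."""
    tier = ToyTier(mode)
    for op, obj in ops:
        if op == "w":
            tier.write(obj)
        else:
            tier.read(obj)
    return tier.hits, tier.misses


# Hypothetical workload: "a" is written then read, "b" is only ever read.
ops = [("w", "a"), ("r", "a"), ("r", "b"), ("r", "b"), ("r", "b")]

for mode in ("readforward", "writeback"):
    hits, misses = replay(mode, ops)
    print(mode, "hits:", hits, "misses:", misses)
```

In this model readforward only ever hits on "a", the object that was
written earlier, while every read of "b" goes to the backing pool. The
first read of "b" is a miss in both modes, which is the cold read case.
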

Reads from the HDD OSDs can very much benefit from having enough RAM to
hold all the SLAB objects (direntry etc) in memory, so you can avoid disk
access to actually find the object.

To speed up the actual data reads you have the option of the cache-tier (in
writeback mode, with proper promotion and retention configuration).

Or something like bcache on the OSD servers, discussed here several times
as well.

> The second question, is it any cache tier mode, that replica can be set on
> 1, for best use of SSD space?
>
A cache-tier (the same is true for any other real caching method) will at
any given time have objects in it that are NOT on the actual backing
storage when it is used to cache writes.
So it needs to be just as redundant as the rest of the system, at least a
replica of 2 with sufficiently small/fast SSDs.

With bcache etc just caching reads, you can get away with a single
replication of course, however failing SSDs may then cause your cluster to
melt down.

Christian
-- 
Christian Balzer        Network/Systems Engineer
chibi@xxxxxxx           Rakuten Communications
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com