Hey Haomai,

Nice work! By any chance, do you have a branch that contains all the changes you've made, so that people can try them themselves? :)

I look forward to reading more results :)

Thanks!
Cheers.
––––
Sébastien Han
Cloud Engineer

"Always give 100%. Unless you're giving blood."

Phone: +33 (0)1 49 70 99 72
Mail: sebastien.han@xxxxxxxxxxxx
Address: 11 bis, rue Roquépine - 75008 Paris
Web: www.enovance.com - Twitter: @enovance

On 09 Apr 2014, at 14:08, Alexandre DERUMIER <aderumier@xxxxxxxxx> wrote:

> Hi,
>
> Thanks for sharing!
> (I'm looking to build a full-SSD cluster too, with 1TB SSDs.)
>
>>> With these, we achieve 3x-4x performance improvements on filestore with SSD.
>
> Do you have some IOPS benchmark values, before and after?
>
> ----- Original Message -----
>
> From: "Haomai Wang" <haomaiwang@xxxxxxxxx>
> To: ceph-devel@xxxxxxxxxxxxxxx
> Sent: Wednesday, 9 April 2014 12:05:19
> Subject: [Share] Performance tuning on Ceph FileStore with SSD backend
>
> Hi all,
>
> I would like to share some ideas about how to improve performance on
> Ceph with SSDs. These are rough observations, not precise measurements.
>
> Our SSDs are 500GB and each OSD owns one SSD (the journal is on the
> same SSD). The Ceph version is 0.67.5 (Dumpling).
>
> At first, we found three bottlenecks in FileStore:
> 1. fdcache_lock (changed in the Firefly release)
> 2. lfn_find in the omap_* methods
> 3. the DBObjectMap header
>
> Based on my understanding and the docs in
> ObjectStore.h (https://github.com/ceph/ceph/blob/master/src/os/ObjectStore.h),
> I simply removed lfn_find from the omap_* methods and removed
> fdcache_lock. I'm not fully sure this change is correct, but it has
> worked well so far.
>
> The DBObjectMap header patch is in the pull request queue and may be
> merged in the next feature merge window.
>
> With the changes above, we got a large improvement in disk utilization
> and benchmark results (3x-4x).
>
> Next, we found the fdcache size became the main bottleneck. For
> example, if the hot data range is 100GB, we need 25,000 (100GB/4MB)
> fds in the cache; if the hot data range is 1TB, we need 250,000
> (1000GB/4MB). As "filestore_fd_cache_size" grows, the cost of an
> FDCache lookup and of a cache miss becomes unaffordable, because the
> FDCache implementation isn't O(1). So we only get high performance
> while the hot data fits in the fdcache (maybe 100GB with a
> 10240-entry fdcache), and any data set larger than the fdcache is a
> disaster. If you try to cache more fds (a 102400-entry fdcache), the
> FDCache implementation adds non-negligible extra CPU cost to every op.
>
> Because of the capacity of our SSDs (several hundred GB), we tried
> increasing the rbd object size (to 16MB) so that fewer cached fds are
> needed: a 1TB hot range then needs only 62,500 fds instead of 250,000.
> For the FDCache implementation, we simply discarded SimpleLRU and
> introduced a RandomCache. Now we can set a much larger fdcache size
> (caching nearly all fds) with little overhead.
>
> With these changes, we achieve 3x-4x performance improvements on
> FileStore with SSDs.
>
> I may have missed something or gotten something wrong; I hope you can
> correct me if so. I hope this helps improve FileStore on SSDs and can
> be pushed into the master branch.
>
> --
>
> Best Regards,
>
> Wheat
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
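
To make the lfn_find change above concrete: each FileStore omap method first resolved the object's long filename on disk before touching the DBObjectMap. A rough sketch of the pattern (approximate and from memory; the exact Dumpling signatures may differ):

  int FileStore::omap_get(coll_t c, const ghobject_t &hoid,
                          bufferlist *header,
                          map<string, bufferlist> *out)
  {
    IndexedPath path;
    int r = lfn_find(c, hoid, &path);  // on-disk lookup whose only role here
    if (r < 0)                         // is to confirm the object exists
      return r;
    return object_map->get(hoid, header, out);  // the actual omap read
  }

The change Haomai describes drops the lfn_find call and goes straight to object_map->get(), saving one filesystem metadata lookup per omap operation; the open question he raises is whether skipping the existence check is always safe.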
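
For reference, the two knobs involved in the fdcache discussion can be set as follows. filestore_fd_cache_size is the option named in the mail; rbd's --order flag sets the object size as a power of two, so order 24 = 2^24 bytes = 16MB (the default is order 22 = 4MB). The values below are illustrative, not recommendations:

  # ceph.conf: enlarge the FileStore fd cache
  [osd]
  filestore fd cache size = 102400

  # create an rbd image with 16MB objects instead of the default 4MB
  rbd create mypool/test-image --size 102400 --order 24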
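
As for replacing SimpleLRU with a RandomCache: the appeal of random eviction is that it needs no recency bookkeeping, so a hit is a single hash probe and eviction is O(1) no matter how large the cache grows. A minimal generic sketch of the technique (an illustration only, not Ceph's actual RandomCache):

  // Random-eviction cache: an unordered_map for lookups plus a dense
  // key array so a uniformly random victim can be picked in O(1).
  #include <cstdlib>
  #include <unordered_map>
  #include <vector>

  template <typename K, typename V>
  class RandomEvictionCache {
    struct Entry { V value; size_t slot; };  // slot = index into keys_
    std::unordered_map<K, Entry> map_;
    std::vector<K> keys_;  // dense key array for O(1) victim selection
    size_t max_size_;

  public:
    explicit RandomEvictionCache(size_t max_size) : max_size_(max_size) {}

    void add(const K& key, const V& value) {
      auto it = map_.find(key);
      if (it != map_.end()) { it->second.value = value; return; }
      if (map_.size() >= max_size_) {
        // Evict a uniformly random entry: swap-remove it from the
        // dense array, then fix the moved key's stored slot.
        size_t victim = std::rand() % keys_.size();
        K victim_key = keys_[victim];
        keys_[victim] = keys_.back();
        map_[keys_[victim]].slot = victim;
        keys_.pop_back();
        map_.erase(victim_key);
      }
      keys_.push_back(key);
      map_.emplace(key, Entry{value, keys_.size() - 1});
    }

    // A hit costs one hash probe; unlike LRU there is no recency list
    // to update, so hits stay cheap even with very large caches.
    bool lookup(const K& key, V* out) const {
      auto it = map_.find(key);
      if (it == map_.end()) return false;
      *out = it->second.value;
      return true;
    }
  };

In FileStore terms the key would identify the object and the value the cached fd. A random victim may occasionally be hot, but with the cache sized to hold nearly every fd, as Haomai suggests, evictions become rare anyway.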