>>From explorations we and others have done, I think what we really want
>>to do here is make it cheaper to look up and open files. The FileStore
>>is very much not optimized for this; a single lookup involves
>>constructing the path from its components multiple times, and I think
>>it even does the lookups more than once.
>>Also, 250k or even 25k file descriptors is an awful lot to demand. ;)

Could the upcoming leveldb backend store help for this specific case?

----- Original Message -----
From: "Gregory Farnum" <greg@xxxxxxxxxxx>
To: "Haomai Wang" <haomaiwang@xxxxxxxxx>
Cc: ceph-devel@xxxxxxxxxxxxxxx
Sent: Wednesday, April 9, 2014 16:15:14
Subject: Re: [Share] Performance tuning on Ceph FileStore with SSD backend

On Wed, Apr 9, 2014 at 3:05 AM, Haomai Wang <haomaiwang@xxxxxxxxx> wrote:
> Hi all,
>
> I would like to share some ideas about how to improve performance on
> Ceph with SSDs. These are rough observations rather than precise
> measurements.
>
> Our SSDs are 500GB and each OSD owns one SSD (the journal is on the
> same SSD). The Ceph version is 0.67.5 (Dumpling).
>
> At first, we found three bottlenecks in the FileStore:
> 1. fdcache_lock (changed in the Firefly release)
> 2. lfn_find in the omap_* methods
> 3. the DBObjectMap header
>
> Based on my understanding and the docs in
> ObjectStore.h (https://github.com/ceph/ceph/blob/master/src/os/ObjectStore.h),
> I simply removed lfn_find in omap_* and fdcache_lock. I'm not fully
> sure of the correctness of this change, but it has worked well so far.

"Simply remove"? I don't remember all the details, but I'm sure there's
more to it than that if you want things to behave.

> The DBObjectMap header patch is in the pull request queue and may be
> merged in the next feature merge window.
>
> With the changes above, we got a large performance improvement in
> disk utilization and benchmark results (3x-4x).
>
> Next, we found the fdcache size becomes the main bottleneck. For
> example, if the hot data range is 100GB, we need 25000 (100GB/4MB)
> fds to cache it; if the hot data range is 1TB, we need 250000
> (1000GB/4MB) fds. If we increase "filestore_fd_cache_size", the cost
> of an FDCache lookup and of a cache miss becomes too expensive to
> afford; the FDCache implementation isn't O(1). So we can only get
> high performance when the hot data fits in the fdcache (maybe 100GB
> with a 10240-entry fdcache), and a working set that exceeds the
> fdcache size is a disaster. If you want to cache more fds (a
> 102400-entry fdcache), the FDCache implementation adds extra CPU cost
> to every op that can't be ignored.

From explorations we and others have done, I think what we really want
to do here is make it cheaper to look up and open files. The FileStore
is very much not optimized for this; a single lookup involves
constructing the path from its components multiple times, and I think
it even does the lookups more than once.
Also, 250k or even 25k file descriptors is an awful lot to demand. ;)
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
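
For concreteness, an fd cache with O(1) hits and O(1) eviction can be
built from a hash map plus an LRU list. The sketch below is purely
illustrative (the class name and structure are hypothetical; it is not
Ceph's actual FDCache), and a production version would also need
reference counting so an fd still in use by an op can't be evicted and
closed underneath it:

// Minimal sketch of an O(1) fd cache: a hash map for lookup plus an
// LRU list for eviction. Illustration only, not Ceph's FDCache.
#include <list>
#include <string>
#include <unordered_map>
#include <fcntl.h>    // open()
#include <unistd.h>   // close()

class SimpleFDCache {
  struct Entry {
    std::string name;
    int fd;
  };
  size_t max_size;
  std::list<Entry> lru;  // front = most recently used
  std::unordered_map<std::string, std::list<Entry>::iterator> index;

public:
  explicit SimpleFDCache(size_t max) : max_size(max) {}

  // Return a cached fd, or open the file and cache it.
  // Both the hit and the miss path are O(1) amortized.
  int get(const std::string& path) {
    auto it = index.find(path);
    if (it != index.end()) {
      lru.splice(lru.begin(), lru, it->second);  // move to front, O(1)
      return it->second->fd;
    }
    int fd = ::open(path.c_str(), O_RDWR);
    if (fd < 0)
      return -1;
    if (lru.size() >= max_size) {  // evict the least recently used fd
      ::close(lru.back().fd);
      index.erase(lru.back().name);
      lru.pop_back();
    }
    lru.push_front({path, fd});
    index[path] = lru.begin();
    return fd;
  }

  ~SimpleFDCache() {
    for (auto& e : lru)
      ::close(e.fd);
  }
};

With this shape, growing the cache from 10240 to 102400 entries adds
memory but no per-op CPU cost, which is exactly the property the
numbers above call for.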
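
As for the descriptor counts themselves: before a cache that large is
even possible, the process's RLIMIT_NOFILE has to be raised, since
default soft limits are far below 250k. A minimal check using the
standard POSIX getrlimit/setrlimit calls (the 250000 target is just
the figure from the example above, and the hard limit still caps what
an unprivileged process may request):

#include <sys/resource.h>
#include <cstdio>

int main() {
  struct rlimit rl;
  if (getrlimit(RLIMIT_NOFILE, &rl) != 0) {
    std::perror("getrlimit");
    return 1;
  }
  std::printf("soft=%llu hard=%llu\n",
              (unsigned long long)rl.rlim_cur,
              (unsigned long long)rl.rlim_max);
  rl.rlim_cur = 250000;         // hypothetical target from the thread
  if (rl.rlim_cur > rl.rlim_max)
    rl.rlim_cur = rl.rlim_max;  // clamp to the hard limit
  if (setrlimit(RLIMIT_NOFILE, &rl) != 0) {
    std::perror("setrlimit");
    return 1;
  }
  return 0;
}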