>>From explorations we and others have done, I think what we really want
>>to do here is make it cheaper to look up and open files. The FileStore
>>is very much not optimized for this; a single lookup involves
>>constructing the path from its components multiple times, and I think
>>it even does the lookups more than once.
>>Also, 250k or even 25k file descriptors is an awful lot to demand. ;)

Could the upcoming leveldb backend store help for this specific case?

----- Original Message -----
From: "Gregory Farnum" <greg@xxxxxxxxxxx>
To: "Haomai Wang" <haomaiwang@xxxxxxxxx>
Cc: ceph-devel@xxxxxxxxxxxxxxx
Sent: Wednesday, April 9, 2014 16:15:14
Subject: Re: [Share] Performance tuning on Ceph FileStore with SSD backend

On Wed, Apr 9, 2014 at 3:05 AM, Haomai Wang <haomaiwang@xxxxxxxxx> wrote:
> Hi all,
>
> I would like to share some ideas about how to improve performance on
> Ceph with SSDs. These are rough observations rather than precise
> measurements.
>
> Our SSDs are 500GB and each OSD owns one SSD (the journal is on the
> same SSD). The Ceph version is 0.67.5 (Dumpling).
>
> At first, we found three bottlenecks in the FileStore:
> 1. fdcache_lock (changed in the Firefly release)
> 2. lfn_find in the omap_* methods
> 3. the DBObjectMap header
>
> Based on my understanding and the docs in
> ObjectStore.h (https://github.com/ceph/ceph/blob/master/src/os/ObjectStore.h),
> I simply removed lfn_find in omap_* and fdcache_lock. I'm not fully
> sure of the correctness of this change, but it has worked well so far.

"Simply remove"? I don't remember all the details, but I'm sure there's
more to it than that if you want things to behave.

> The DBObjectMap header patch is in the pull request queue and may be
> merged in the next feature merge window.
>
> With the changes above, we got a large performance improvement in
> disk utilization and benchmark results (3x-4x).
>
> Next, we found the fdcache size becomes the main bottleneck. For
> example, if the hot data range is 100GB, we need 25000 (100GB/4MB)
> fds to cache it; if the hot data range is 1TB, we need 250000
> (1000GB/4MB) fds. If we increase "filestore_fd_cache_size", the cost
> of an FDCache lookup and of a cache miss becomes too expensive to
> afford; the FDCache implementation isn't O(1). So we can only get
> high performance when the hot data fits in the fdcache (maybe 100GB
> with a 10240-entry fdcache), and a working set that exceeds the
> fdcache size is a disaster. If you want to cache more fds (a
> 102400-entry fdcache), the FDCache implementation adds extra CPU cost
> to every op that can't be ignored.

From explorations we and others have done, I think what we really want
to do here is make it cheaper to look up and open files. The FileStore
is very much not optimized for this; a single lookup involves
constructing the path from its components multiple times, and I think
it even does the lookups more than once.
Also, 250k or even 25k file descriptors is an awful lot to demand. ;)
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
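
For concreteness, an fd cache with O(1) hits and O(1) eviction can be
built from a hash map plus an LRU list. The sketch below is purely
illustrative (the class name and structure are hypothetical; it is not
Ceph's actual FDCache), and a production version would also need
reference counting so an fd still in use by an op can't be evicted and
closed underneath it:

// Minimal sketch of an O(1) fd cache: a hash map for lookup plus an
// LRU list for eviction. Illustration only, not Ceph's FDCache.
#include <list>
#include <string>
#include <unordered_map>
#include <fcntl.h>    // open()
#include <unistd.h>   // close()

class SimpleFDCache {
  struct Entry {
    std::string name;
    int fd;
  };
  size_t max_size;
  std::list<Entry> lru;  // front = most recently used
  std::unordered_map<std::string, std::list<Entry>::iterator> index;

public:
  explicit SimpleFDCache(size_t max) : max_size(max) {}

  // Return a cached fd, or open the file and cache it.
  // Both the hit and the miss path are O(1) amortized.
  int get(const std::string& path) {
    auto it = index.find(path);
    if (it != index.end()) {
      lru.splice(lru.begin(), lru, it->second);  // move to front, O(1)
      return it->second->fd;
    }
    int fd = ::open(path.c_str(), O_RDWR);
    if (fd < 0)
      return -1;
    if (lru.size() >= max_size) {  // evict the least recently used fd
      ::close(lru.back().fd);
      index.erase(lru.back().name);
      lru.pop_back();
    }
    lru.push_front({path, fd});
    index[path] = lru.begin();
    return fd;
  }

  ~SimpleFDCache() {
    for (auto& e : lru)
      ::close(e.fd);
  }
};

With this shape, growing the cache from 10240 to 102400 entries adds
memory but no per-op CPU cost, which is exactly the property the
numbers above call for.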
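
As for the descriptor counts themselves: before a cache that large is
even possible, the process's RLIMIT_NOFILE has to be raised, since
default soft limits are far below 250k. A minimal check using the
standard POSIX getrlimit/setrlimit calls (the 250000 target is just
the figure from the example above, and the hard limit still caps what
an unprivileged process may request):

#include <sys/resource.h>
#include <cstdio>

int main() {
  struct rlimit rl;
  if (getrlimit(RLIMIT_NOFILE, &rl) != 0) {
    std::perror("getrlimit");
    return 1;
  }
  std::printf("soft=%llu hard=%llu\n",
              (unsigned long long)rl.rlim_cur,
              (unsigned long long)rl.rlim_max);
  rl.rlim_cur = 250000;         // hypothetical target from the thread
  if (rl.rlim_cur > rl.rlim_max)
    rl.rlim_cur = rl.rlim_max;  // clamp to the hard limit
  if (setrlimit(RLIMIT_NOFILE, &rl) != 0) {
    std::perror("setrlimit");
    return 1;
  }
  return 0;
}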