Have you guys seen the wiki page - http://www.gluster.com/community/documentation/index.php/GlusterFS_2.0_Benchmark_Results_%28compared_with_NFS%29 - it would be interesting if you could replicate the "Single GlusterFS" section to see how things have changed... Ian <http://www.gluster.com/community/documentation/index.php/GlusterFS_2.0_Benchmark_Results_%28compared_with_NFS%29#Single_GlusterFS_.26_NFS_Test_Platform> On 29/03/2010 19:21, Jeremy Enos wrote: > Got a chance to run your suggested test: > > ##############GLUSTER SINGLE DISK############## > > [root at ac33 gjenos]# dd bs=4096 count=32768 if=/dev/zero > of=./filename.test > 32768+0 records in > 32768+0 records out > 134217728 bytes (134 MB) copied, 8.60486 s, 15.6 MB/s > [root at ac33 gjenos]# > [root at ac33 gjenos]# cd /export/jenos/ > > ##############DIRECT SINGLE DISK############## > > [root at ac33 jenos]# dd bs=4096 count=32768 if=/dev/zero of=./filename.test > 32768+0 records in > 32768+0 records out > 134217728 bytes (134 MB) copied, 0.21915 s, 612 MB/s > [root at ac33 jenos]# > > If doing anything that can see a cache benefit, the performance of > Gluster can't compare. Is it even using cache? > > This is the client vol file I used for that test: > > [root at ac33 jenos]# cat /etc/glusterfs/ghome.vol > #-----------IB remotes------------------ > volume ghome > type protocol/client > option transport-type tcp/client > option remote-host ac33 > option remote-subvolume ibstripe > end-volume > > #------------Performance Options------------------- > > volume readahead > type performance/read-ahead > option page-count 4 # 2 is default option > option force-atime-update off # default is off > subvolumes ghome > end-volume > > volume writebehind > type performance/write-behind > option cache-size 1MB > subvolumes readahead > end-volume > > volume cache > type performance/io-cache > option cache-size 2GB > subvolumes writebehind > end-volume > > > Any suggestions appreciated. thx- > > Jeremy > > On 3/26/2010 6:09 PM, Bryan Whitehead wrote: >> One more thought, looks like (from your emails) you are always running >> the gluster test first. Maybe the tar file is being read from disk >> when you do the gluster test, then being read from cache when you run >> for the disk. >> >> What if you just pull a chunk of 0's off /dev/zero? >> >> dd bs=4096 count=32768 if=/dev/zero of=./filename.test >> >> or stick the tar in a ramdisk? >> >> (or run the benchmark 10 times for each, drop the best and the worse, >> and average the remaining 8) >> >> Would also be curious if you add another node if the time would be >> halved, then add another 2... then it would be halved again? I guess >> that depends on if striping or just replicating is being used. >> (unfortunately I don't have access to more than 1 test box right now). >> >> On Wed, Mar 24, 2010 at 11:06 PM, Jeremy Enos<jenos at ncsa.uiuc.edu> >> wrote: >>> For completeness: >>> >>> ##############GLUSTER SINGLE DISK NO PERFORMANCE OPTIONS############## >>> [root at ac33 gjenos]# time (tar xzf >>> /scratch/jenos/intel/l_cproc_p_11.1.064_intel64.tgz&& sync ) >>> >>> real 0m41.052s >>> user 0m7.705s >>> sys 0m3.122s >>> ##############DIRECT SINGLE DISK############## >>> [root at ac33 gjenos]# cd /export/jenos >>> [root at ac33 jenos]# time (tar xzf >>> /scratch/jenos/intel/l_cproc_p_11.1.064_intel64.tgz&& sync ) >>> >>> real 0m22.093s >>> user 0m6.932s >>> sys 0m2.459s >>> [root at ac33 jenos]# >>> >>> The performance options don't appear to be the problem. 
So the >>> question >>> stands- how do I get the disk cache advantage through the Gluster >>> mounted >>> filesystem? It seems to be key in the large performance difference. >>> >>> Jeremy >>> >>> On 3/24/2010 4:47 PM, Jeremy Enos wrote: >>>> Good suggestion- I hadn't tried that yet. It brings them much closer. >>>> >>>> ##############GLUSTER SINGLE DISK############## >>>> [root at ac33 gjenos]# time (tar xzf >>>> /scratch/jenos/intel/l_cproc_p_11.1.064_intel64.tgz&& sync ) >>>> >>>> real 0m32.089s >>>> user 0m6.516s >>>> sys 0m3.177s >>>> ##############DIRECT SINGLE DISK############## >>>> [root at ac33 gjenos]# cd /export/jenos/ >>>> [root at ac33 jenos]# time (tar xzf >>>> /scratch/jenos/intel/l_cproc_p_11.1.064_intel64.tgz&& sync ) >>>> >>>> real 0m25.089s >>>> user 0m6.850s >>>> sys 0m2.058s >>>> ##############DIRECT SINGLE DISK CACHED############## >>>> [root at ac33 jenos]# time (tar xzf >>>> /scratch/jenos/intel/l_cproc_p_11.1.064_intel64.tgz ) >>>> >>>> real 0m8.955s >>>> user 0m6.785s >>>> sys 0m1.848s >>>> >>>> >>>> Oddly, I'm seeing better performance on the gluster system than >>>> previous >>>> tests too (used to be ~39 s). The direct disk time is obviously >>>> benefiting >>>> from cache. There is still a difference, but it appears most of the >>>> difference disappears w/ the cache advantage removed. That said- the >>>> relative performance issue then still exists with Gluster. What >>>> can be done >>>> to make it benefit from cache the same way direct disk does? >>>> thx- >>>> >>>> Jeremy >>>> >>>> P.S. >>>> I'll be posting results w/ performance options completely removed from >>>> gluster as soon as I get a chance. >>>> >>>> Jeremy >>>> >>>> On 3/24/2010 4:23 PM, Bryan Whitehead wrote: >>>>> I'd like to see results with this: >>>>> >>>>> time ( tar xzf /scratch/jenos/intel/l_cproc_p_11.1.064_intel64.tgz&& >>>>> sync ) >>>>> >>>>> I've found local filesystems seem to use cache very heavily. The >>>>> untarred file could mostly be sitting in ram with local fs vs going >>>>> though fuse (which might do many more sync'ed flushes to disk?). >>>>> >>>>> On Wed, Mar 24, 2010 at 2:25 AM, Jeremy >>>>> Enos<jenos at ncsa.uiuc.edu> wrote: >>>>>> I also neglected to mention that the underlying filesystem is ext3. >>>>>> >>>>>> On 3/24/2010 3:44 AM, Jeremy Enos wrote: >>>>>>> I haven't tried all performance options disabled yet- I can try >>>>>>> that >>>>>>> tomorrow when the resource frees up. I was actually asking first >>>>>>> before >>>>>>> blindly trying different configuration matrices in case there's >>>>>>> a clear >>>>>>> direction I should take with it. I'll let you know. >>>>>>> >>>>>>> Jeremy >>>>>>> >>>>>>> On 3/24/2010 2:54 AM, Stephan von Krawczynski wrote: >>>>>>>> Hi Jeremy, >>>>>>>> >>>>>>>> have you tried to reproduce with all performance options disabled? >>>>>>>> They >>>>>>>> are >>>>>>>> possibly no good idea on a local system. >>>>>>>> What local fs do you use? >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> Regards, >>>>>>>> Stephan >>>>>>>> >>>>>>>> >>>>>>>> On Tue, 23 Mar 2010 19:11:28 -0500 >>>>>>>> Jeremy Enos<jenos at ncsa.uiuc.edu> wrote: >>>>>>>> >>>>>>>>> Stephan is correct- I primarily did this test to show a >>>>>>>>> demonstrable >>>>>>>>> overhead example that I'm trying to eliminate. It's pronounced >>>>>>>>> enough >>>>>>>>> that it can be seen on a single disk / single node configuration, >>>>>>>>> which >>>>>>>>> is good in a way (so anyone can easily repro). 
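A repeatable way to run that comparison, along the lines Bryan suggests above, is to drop the kernel caches before every run so neither target starts warm. What follows is only a rough sketch: it assumes bash, a root shell (writing to /proc/sys/vm/drop_caches needs one), and placeholder paths, with /ghome/jenos standing in for wherever the gluster client is mounted and /export/jenos for the backend disk, as in the prompts above.

#!/bin/bash
# Untar the same tarball into the gluster mount and into the backend disk,
# RUNS times each, flushing the page/dentry/inode caches before every run.
TARBALL=/scratch/jenos/intel/l_cproc_p_11.1.064_intel64.tgz
RUNS=10

for dir in /ghome/jenos /export/jenos; do        # gluster mount, then direct disk
    for i in $(seq "$RUNS"); do
        rm -rf "$dir/bench" && mkdir -p "$dir/bench"
        sync
        echo 3 > /proc/sys/vm/drop_caches        # start every run cold
        echo "== $dir run $i =="
        time ( tar xzf "$TARBALL" -C "$dir/bench" && sync )
    done
done

Dropping the best and worst run for each directory and averaging the rest, as suggested earlier in the thread, then gives numbers that no longer depend on whether the tarball or the extracted tree happened to be sitting in RAM.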
>>>>>>>>> >>>>>>>>> My distributed/clustered solution would be ideal if it were fast >>>>>>>>> enough >>>>>>>>> for small block i/o as well as large block- I was hoping that >>>>>>>>> single >>>>>>>>> node systems would achieve that, hence the single node test. >>>>>>>>> Because >>>>>>>>> the single node test performed poorly, I eventually reduced >>>>>>>>> down to >>>>>>>>> single disk to see if it could still be seen, and it clearly >>>>>>>>> can be. >>>>>>>>> Perhaps it's something in my configuration? I've pasted my >>>>>>>>> config >>>>>>>>> files >>>>>>>>> below. >>>>>>>>> thx- >>>>>>>>> >>>>>>>>> Jeremy >>>>>>>>> >>>>>>>>> ######################glusterfsd.vol###################### >>>>>>>>> volume posix >>>>>>>>> type storage/posix >>>>>>>>> option directory /export >>>>>>>>> end-volume >>>>>>>>> >>>>>>>>> volume locks >>>>>>>>> type features/locks >>>>>>>>> subvolumes posix >>>>>>>>> end-volume >>>>>>>>> >>>>>>>>> volume disk >>>>>>>>> type performance/io-threads >>>>>>>>> option thread-count 4 >>>>>>>>> subvolumes locks >>>>>>>>> end-volume >>>>>>>>> >>>>>>>>> volume server-ib >>>>>>>>> type protocol/server >>>>>>>>> option transport-type ib-verbs/server >>>>>>>>> option auth.addr.disk.allow * >>>>>>>>> subvolumes disk >>>>>>>>> end-volume >>>>>>>>> >>>>>>>>> volume server-tcp >>>>>>>>> type protocol/server >>>>>>>>> option transport-type tcp/server >>>>>>>>> option auth.addr.disk.allow * >>>>>>>>> subvolumes disk >>>>>>>>> end-volume >>>>>>>>> >>>>>>>>> ######################ghome.vol###################### >>>>>>>>> >>>>>>>>> #-----------IB remotes------------------ >>>>>>>>> volume ghome >>>>>>>>> type protocol/client >>>>>>>>> option transport-type ib-verbs/client >>>>>>>>> # option transport-type tcp/client >>>>>>>>> option remote-host acfs >>>>>>>>> option remote-subvolume raid >>>>>>>>> end-volume >>>>>>>>> >>>>>>>>> #------------Performance Options------------------- >>>>>>>>> >>>>>>>>> volume readahead >>>>>>>>> type performance/read-ahead >>>>>>>>> option page-count 4 # 2 is default option >>>>>>>>> option force-atime-update off # default is off >>>>>>>>> subvolumes ghome >>>>>>>>> end-volume >>>>>>>>> >>>>>>>>> volume writebehind >>>>>>>>> type performance/write-behind >>>>>>>>> option cache-size 1MB >>>>>>>>> subvolumes readahead >>>>>>>>> end-volume >>>>>>>>> >>>>>>>>> volume cache >>>>>>>>> type performance/io-cache >>>>>>>>> option cache-size 1GB >>>>>>>>> subvolumes writebehind >>>>>>>>> end-volume >>>>>>>>> >>>>>>>>> ######################END###################### >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> On 3/23/2010 6:02 AM, Stephan von Krawczynski wrote: >>>>>>>>>> On Tue, 23 Mar 2010 02:59:35 -0600 (CST) >>>>>>>>>> "Tejas N. Bhise"<tejas at gluster.com> wrote: >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> Out of curiosity, if you want to do stuff only on one machine, >>>>>>>>>>> why do you want to use a distributed, multi node, clustered, >>>>>>>>>>> file system ? >>>>>>>>>>> >>>>>>>>>> Because what he does is a very good way to show the overhead >>>>>>>>>> produced >>>>>>>>>> only by >>>>>>>>>> glusterfs and nothing else (i.e. no network involved). >>>>>>>>>> A pretty relevant test scenario I would say. >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> Regards, >>>>>>>>>> Stephan >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> Am I missing something here ? >>>>>>>>>>> >>>>>>>>>>> Regards, >>>>>>>>>>> Tejas. 
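For the run with all performance options disabled that Stephan asks about, the client side can be cut down to just the protocol/client translator pointing straight at the exported volume. A minimal sketch, assuming the server file quoted above (which exports the io-threads volume named "disk" on ac33 over tcp); the volfile path and mount point here are made up and would need adjusting:

# create a client volfile with no read-ahead, write-behind or io-cache,
# then mount it; whatever gap remains against the backend disk is then
# protocol + FUSE overhead rather than a performance-translator setting.
cat > /etc/glusterfs/ghome-minimal.vol <<'EOF'
volume ghome
  type protocol/client
  option transport-type tcp/client
  option remote-host ac33
  option remote-subvolume disk
end-volume
EOF
glusterfs -f /etc/glusterfs/ghome-minimal.vol /mnt/gtest

Rerunning the untar test against /mnt/gtest with this file, and then adding the translators back one at a time, shows whether any single performance option accounts for the difference.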
>>>>>>>>>>> >>>>>>>>>>> ----- Original Message ----- >>>>>>>>>>> From: "Jeremy Enos"<jenos at ncsa.uiuc.edu> >>>>>>>>>>> To: gluster-users at gluster.org >>>>>>>>>>> Sent: Tuesday, March 23, 2010 2:07:06 PM GMT +05:30 Chennai, >>>>>>>>>>> Kolkata, >>>>>>>>>>> Mumbai, New Delhi >>>>>>>>>>> Subject: gluster local vs local = gluster x4 >>>>>>>>>>> slower >>>>>>>>>>> >>>>>>>>>>> This test is pretty easy to replicate anywhere- only takes 1 >>>>>>>>>>> disk, >>>>>>>>>>> one >>>>>>>>>>> machine, one tarball. Untarring to local disk directly vs thru >>>>>>>>>>> gluster >>>>>>>>>>> is about 4.5x faster. At first I thought this may be due to >>>>>>>>>>> a slow >>>>>>>>>>> host >>>>>>>>>>> (Opteron 2.4ghz). But it's not- same configuration, on a much >>>>>>>>>>> faster >>>>>>>>>>> machine (dual 3.33ghz Xeon) yields the performance below. >>>>>>>>>>> >>>>>>>>>>> ####THIS TEST WAS TO A LOCAL DISK THRU GLUSTER#### >>>>>>>>>>> [root at ac33 jenos]# time tar xzf >>>>>>>>>>> /scratch/jenos/intel/l_cproc_p_11.1.064_intel64.tgz >>>>>>>>>>> >>>>>>>>>>> real 0m41.290s >>>>>>>>>>> user 0m14.246s >>>>>>>>>>> sys 0m2.957s >>>>>>>>>>> >>>>>>>>>>> ####THIS TEST WAS TO A LOCAL DISK (BYPASS GLUSTER)#### >>>>>>>>>>> [root at ac33 jenos]# cd /export/jenos/ >>>>>>>>>>> [root at ac33 jenos]# time tar xzf >>>>>>>>>>> /scratch/jenos/intel/l_cproc_p_11.1.064_intel64.tgz >>>>>>>>>>> >>>>>>>>>>> real 0m8.983s >>>>>>>>>>> user 0m6.857s >>>>>>>>>>> sys 0m1.844s >>>>>>>>>>> >>>>>>>>>>> ####THESE ARE TEST FILE DETAILS#### >>>>>>>>>>> [root at ac33 jenos]# tar tzvf >>>>>>>>>>> /scratch/jenos/intel/l_cproc_p_11.1.064_intel64.tgz |wc -l >>>>>>>>>>> 109 >>>>>>>>>>> [root at ac33 jenos]# ls -l >>>>>>>>>>> /scratch/jenos/intel/l_cproc_p_11.1.064_intel64.tgz >>>>>>>>>>> -rw-r--r-- 1 jenos ac 804385203 2010-02-07 06:32 >>>>>>>>>>> /scratch/jenos/intel/l_cproc_p_11.1.064_intel64.tgz >>>>>>>>>>> [root at ac33 jenos]# >>>>>>>>>>> >>>>>>>>>>> These are the relevant performance options I'm using in my .vol >>>>>>>>>>> file: >>>>>>>>>>> >>>>>>>>>>> #------------Performance Options------------------- >>>>>>>>>>> >>>>>>>>>>> volume readahead >>>>>>>>>>> type performance/read-ahead >>>>>>>>>>> option page-count 4 # 2 is default option >>>>>>>>>>> option force-atime-update off # default is off >>>>>>>>>>> subvolumes ghome >>>>>>>>>>> end-volume >>>>>>>>>>> >>>>>>>>>>> volume writebehind >>>>>>>>>>> type performance/write-behind >>>>>>>>>>> option cache-size 1MB >>>>>>>>>>> subvolumes readahead >>>>>>>>>>> end-volume >>>>>>>>>>> >>>>>>>>>>> volume cache >>>>>>>>>>> type performance/io-cache >>>>>>>>>>> option cache-size 1GB >>>>>>>>>>> subvolumes writebehind >>>>>>>>>>> end-volume >>>>>>>>>>> >>>>>>>>>>> What can I do to improve gluster's performance? 
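One way to check whether the io-cache translator above (or the client's page cache) is doing anything for repeat reads is to write a file through the mount and then time two back-to-back reads of it. A rough sketch: the directory is the gluster-mounted one from the tests above, the file name is arbitrary, and 512 MB is chosen only so the file fits inside the 1 GB cache-size configured here.

# write a file through the gluster mount, then read it twice; if the
# second read is not much faster than the first, repeat reads are not
# being served from io-cache or from the kernel page cache.
cd /ghome/jenos                           # gluster-mounted directory (placeholder)
dd if=/dev/zero of=readtest bs=1M count=512 conv=fsync
echo 3 > /proc/sys/vm/drop_caches         # clears the kernel caches (not the
                                          # glusterfs process's own io-cache)
dd if=readtest of=/dev/null bs=1M         # first read, as cold as possible
dd if=readtest of=/dev/null bs=1M         # repeat read; fast only if some cache layer works

If both reads run at raw-disk speed, none of the cache layers is helping reads through the mount, which would bear on the question near the top of this thread about whether gluster is using the cache at all.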
>>>>>>>>>>>
>>>>>>>>>>> Jeremy
>>>>>>>>>>>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://gluster.org/cgi-bin/mailman/listinfo/gluster-users