gluster local vs local = gluster x4 slower

jenos at ncsa.uiuc.edu (Jeremy Enos) · Wed, 24 Mar 2010 01:57:58 -0500

I'll do my best to describe it.  I have QDR IB between each host.  I'd
like a very large filesystem that's reliable and high performance for
both large and small files- but coming back to reality, I don't actually
expect to get all of those features in a single filesystem, but perhaps
in two.  The filesystem is required for an HPC cluster, so the workload
really does span large and small files, with varying reliability
requirements.  My estimate is that I could do it with two filesystems:
one unreliable with high i/o (/scratch) and one reliable with medium i/o
(/home).

/scratch would not be required to be reliable, but if spread across 
enough nodes- AFR may be prudent, as I have it configured now.  This 
configuration has already yielded good aggregate i/o for large block 
transfers.  Small block is quite slow though.  If I do span multiple 
hosts, I can't dedicate those hosts to it- so the configuration would be 
spread wide to distribute i/o load as thin as possible too.  If I can't 
get good small block i/o performance out of the same filesystem, I'm ok 
with using /home for that.

/home would need to be reliable of course- and also medium performance
for large block and medium to high performance for small block i/o.  I
thought centralizing the disks on a single host would help achieve this,
but it has not- the small block overhead is apparently due less to
distribution over hosts than to gluster itself, or perhaps to my
specific configuration.
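
To separate per-file overhead from raw bandwidth, something like the
following could be run against both the gluster mount and the backing
directory.  This is only a rough sketch: the function name
smallfile_test, the default count and size, and the scratch
subdirectory name are made up for illustration, and it assumes that
head -c and date +%s are available on the host.

```shell
#!/bin/sh
# Rough small-file benchmark (illustrative sketch, not a tuned tool).
# Creates COUNT files of SIZE bytes under DIR, reports the elapsed
# wall-clock time in whole seconds, then removes the files again.
# Usage: smallfile_test DIR [COUNT] [SIZE]
smallfile_test() {
    dir=$1
    count=${2:-1000}     # hypothetical default: 1000 files
    size=${3:-4096}      # hypothetical default: 4 KB per file
    work="$dir/smallfile_test.$$"   # scratch subdirectory, name is arbitrary

    mkdir -p "$work" || return 1
    start=$(date +%s)
    i=0
    while [ "$i" -lt "$count" ]; do
        # One create + one small write + one close per file.
        head -c "$size" /dev/zero > "$work/f$i" || return 1
        i=$((i + 1))
    done
    end=$(date +%s)
    rm -rf "$work"

    echo "$dir: $count files of $size bytes in $((end - start))s"
}
```

Calling it once with the gluster mount point and once with the backing
directory (for example /export/jenos), using the same count and size,
should show whether the gap scales with the number of files rather than
with the amount of data written.
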
thanks very much for all your help here-

     Jeremy

On 3/23/2010 10:27 PM, Tejas N. Bhise wrote:
> It might also be useful overall to know what you want to achieve. It's better to do sizing, performance, etc. if there is clarity on what is to be achieved. Once that is clear, it would be more useful to say whether something is possible with the config you are trying, why or why not, and whether the expectations are even justified for what is essentially a distributed networked FS.
>
>
> ----- Original Message -----
> From: "Jeremy Enos"<jenos at ncsa.uiuc.edu>
> To: "Stephan von Krawczynski"<skraw at ithnet.com>
> Cc: "Tejas N. Bhise"<tejas at gluster.com>, gluster-users at gluster.org
> Sent: Wednesday, March 24, 2010 5:41:28 AM GMT +05:30 Chennai, Kolkata, Mumbai, New Delhi
> Subject: Re: gluster local vs local = gluster x4 slower
>
> Stephan is correct- I primarily did this test to show a demonstrable
> overhead example that I'm trying to eliminate.  It's pronounced enough
> that it can be seen on a single disk / single node configuration, which
> is good in a way (so anyone can easily repro).
>
> My distributed/clustered solution would be ideal if it were fast enough
> for small block i/o as well as large block- I was hoping that single
> node systems would achieve that, hence the single node test.  Because
> the single node test performed poorly, I eventually reduced down to
> single disk to see if it could still be seen, and it clearly can be.
> Perhaps it's something in my configuration?  I've pasted my config files
> below.
> thx-
>
>       Jeremy
>
> ######################glusterfsd.vol######################
> volume posix
>     type storage/posix
>     option directory /export
> end-volume
>
> volume locks
>     type features/locks
>     subvolumes posix
> end-volume
>
> volume disk
>     type performance/io-threads
>     option thread-count 4
>     subvolumes locks
> end-volume
>
> volume server-ib
>     type protocol/server
>     option transport-type ib-verbs/server
>     option auth.addr.disk.allow *
>     subvolumes disk
> end-volume
>
> volume server-tcp
>     type protocol/server
>     option transport-type tcp/server
>     option auth.addr.disk.allow *
>     subvolumes disk
> end-volume
>
> ######################ghome.vol######################
>
> #-----------IB remotes------------------
> volume ghome
>     type protocol/client
>     option transport-type ib-verbs/client
> #  option transport-type tcp/client
>     option remote-host acfs
>     option remote-subvolume raid
> end-volume
>
> #------------Performance Options-------------------
>
> volume readahead
>     type performance/read-ahead
>     option page-count 4           # 2 is default option
>     option force-atime-update off # default is off
>     subvolumes ghome
> end-volume
>
> volume writebehind
>     type performance/write-behind
>     option cache-size 1MB
>     subvolumes readahead
> end-volume
>
> volume cache
>     type performance/io-cache
>     option cache-size 1GB
>     subvolumes writebehind
> end-volume
>
> ######################END######################
>
>
>
> On 3/23/2010 6:02 AM, Stephan von Krawczynski wrote:
>    
>> On Tue, 23 Mar 2010 02:59:35 -0600 (CST)
>> "Tejas N. Bhise"<tejas at gluster.com>   wrote:
>>
>>
>>      
>>> Out of curiosity, if you want to do stuff only on one machine,
>>> why do you want to use a distributed, multi node, clustered,
>>> file system ?
>>>
>>>        
>> Because what he does is a very good way to show the overhead produced only by
>> glusterfs and nothing else (i.e. no network involved).
>> A pretty relevant test scenario I would say.
>>
>> --
>> Regards,
>> Stephan
>>
>>
>>
>>      
>>> Am I missing something here ?
>>>
>>> Regards,
>>> Tejas.
>>>
>>> ----- Original Message -----
>>> From: "Jeremy Enos"<jenos at ncsa.uiuc.edu>
>>> To: gluster-users at gluster.org
>>> Sent: Tuesday, March 23, 2010 2:07:06 PM GMT +05:30 Chennai, Kolkata, Mumbai, New Delhi
>>> Subject: gluster local vs local = gluster x4 slower
>>>
>>> This test is pretty easy to replicate anywhere- it only takes one disk,
>>> one machine, one tarball.  Untarring directly to local disk is about
>>> 4.5x faster than untarring thru gluster.  At first I thought this may be
>>> due to a slow host (Opteron 2.4 GHz).  But it's not- the same
>>> configuration on a much faster machine (dual 3.33 GHz Xeon) yields the
>>> performance below.
>>>
>>> ####THIS TEST WAS TO A LOCAL DISK THRU GLUSTER####
>>> [root@ac33 jenos]# time tar xzf /scratch/jenos/intel/l_cproc_p_11.1.064_intel64.tgz
>>>
>>> real    0m41.290s
>>> user    0m14.246s
>>> sys     0m2.957s
>>>
>>> ####THIS TEST WAS TO A LOCAL DISK (BYPASS GLUSTER)####
>>> [root@ac33 jenos]# cd /export/jenos/
>>> [root@ac33 jenos]# time tar xzf /scratch/jenos/intel/l_cproc_p_11.1.064_intel64.tgz
>>>
>>> real    0m8.983s
>>> user    0m6.857s
>>> sys     0m1.844s
>>>
>>> ####THESE ARE TEST FILE DETAILS####
>>> [root@ac33 jenos]# tar tzvf /scratch/jenos/intel/l_cproc_p_11.1.064_intel64.tgz  |wc -l
>>> 109
>>> [root@ac33 jenos]# ls -l /scratch/jenos/intel/l_cproc_p_11.1.064_intel64.tgz
>>> -rw-r--r-- 1 jenos ac 804385203 2010-02-07 06:32 /scratch/jenos/intel/l_cproc_p_11.1.064_intel64.tgz
>>> [root@ac33 jenos]#
>>>
>>> These are the relevant performance options I'm using in my .vol file:
>>>
>>> #------------Performance Options-------------------
>>>
>>> volume readahead
>>>      type performance/read-ahead
>>>      option page-count 4           # 2 is default option
>>>      option force-atime-update off # default is off
>>>      subvolumes ghome
>>> end-volume
>>>
>>> volume writebehind
>>>      type performance/write-behind
>>>      option cache-size 1MB
>>>      subvolumes readahead
>>> end-volume
>>>
>>> volume cache
>>>      type performance/io-cache
>>>      option cache-size 1GB
>>>      subvolumes writebehind
>>> end-volume
>>>
>>> What can I do to improve gluster's performance?
>>>
>>>        Jeremy
>>>
>>> _______________________________________________
>>> Gluster-users mailing list
>>> Gluster-users at gluster.org
>>> http://gluster.org/cgi-bin/mailman/listinfo/gluster-users