oflag=dsync is going to be really slow on any disk (it makes dd do a synchronous write for every 64k block), or am I missing something here?

On Tue, 2015-02-17 at 10:03 +0800, Punit Dambiwal wrote:
> Hi Vijay,
>
> Please find the volume info here :-
>
> [root@cpu01 ~]# gluster volume info
>
> Volume Name: ds01
> Type: Distributed-Replicate
> Volume ID: 369d3fdc-c8eb-46b7-a33e-0a49f2451ff6
> Status: Started
> Number of Bricks: 48 x 2 = 96
> Transport-type: tcp
> Bricks:
> Brick1: cpu01:/bricks/1/vol1
> Brick2: cpu02:/bricks/1/vol1
> Brick3: cpu03:/bricks/1/vol1
> Brick4: cpu04:/bricks/1/vol1
> Brick5: cpu01:/bricks/2/vol1
> Brick6: cpu02:/bricks/2/vol1
> Brick7: cpu03:/bricks/2/vol1
> Brick8: cpu04:/bricks/2/vol1
> Brick9: cpu01:/bricks/3/vol1
> Brick10: cpu02:/bricks/3/vol1
> Brick11: cpu03:/bricks/3/vol1
> Brick12: cpu04:/bricks/3/vol1
> Brick13: cpu01:/bricks/4/vol1
> Brick14: cpu02:/bricks/4/vol1
> Brick15: cpu03:/bricks/4/vol1
> Brick16: cpu04:/bricks/4/vol1
> Brick17: cpu01:/bricks/5/vol1
> Brick18: cpu02:/bricks/5/vol1
> Brick19: cpu03:/bricks/5/vol1
> Brick20: cpu04:/bricks/5/vol1
> Brick21: cpu01:/bricks/6/vol1
> Brick22: cpu02:/bricks/6/vol1
> Brick23: cpu03:/bricks/6/vol1
> Brick24: cpu04:/bricks/6/vol1
> Brick25: cpu01:/bricks/7/vol1
> Brick26: cpu02:/bricks/7/vol1
> Brick27: cpu03:/bricks/7/vol1
> Brick28: cpu04:/bricks/7/vol1
> Brick29: cpu01:/bricks/8/vol1
> Brick30: cpu02:/bricks/8/vol1
> Brick31: cpu03:/bricks/8/vol1
> Brick32: cpu04:/bricks/8/vol1
> Brick33: cpu01:/bricks/9/vol1
> Brick34: cpu02:/bricks/9/vol1
> Brick35: cpu03:/bricks/9/vol1
> Brick36: cpu04:/bricks/9/vol1
> Brick37: cpu01:/bricks/10/vol1
> Brick38: cpu02:/bricks/10/vol1
> Brick39: cpu03:/bricks/10/vol1
> Brick40: cpu04:/bricks/10/vol1
> Brick41: cpu01:/bricks/11/vol1
> Brick42: cpu02:/bricks/11/vol1
> Brick43: cpu03:/bricks/11/vol1
> Brick44: cpu04:/bricks/11/vol1
> Brick45: cpu01:/bricks/12/vol1
> Brick46: cpu02:/bricks/12/vol1
> Brick47: cpu03:/bricks/12/vol1
> Brick48: cpu04:/bricks/12/vol1
> Brick49: cpu01:/bricks/13/vol1
> Brick50: cpu02:/bricks/13/vol1
> Brick51: cpu03:/bricks/13/vol1
> Brick52: cpu04:/bricks/13/vol1
> Brick53: cpu01:/bricks/14/vol1
> Brick54: cpu02:/bricks/14/vol1
> Brick55: cpu03:/bricks/14/vol1
> Brick56: cpu04:/bricks/14/vol1
> Brick57: cpu01:/bricks/15/vol1
> Brick58: cpu02:/bricks/15/vol1
> Brick59: cpu03:/bricks/15/vol1
> Brick60: cpu04:/bricks/15/vol1
> Brick61: cpu01:/bricks/16/vol1
> Brick62: cpu02:/bricks/16/vol1
> Brick63: cpu03:/bricks/16/vol1
> Brick64: cpu04:/bricks/16/vol1
> Brick65: cpu01:/bricks/17/vol1
> Brick66: cpu02:/bricks/17/vol1
> Brick67: cpu03:/bricks/17/vol1
> Brick68: cpu04:/bricks/17/vol1
> Brick69: cpu01:/bricks/18/vol1
> Brick70: cpu02:/bricks/18/vol1
> Brick71: cpu03:/bricks/18/vol1
> Brick72: cpu04:/bricks/18/vol1
> Brick73: cpu01:/bricks/19/vol1
> Brick74: cpu02:/bricks/19/vol1
> Brick75: cpu03:/bricks/19/vol1
> Brick76: cpu04:/bricks/19/vol1
> Brick77: cpu01:/bricks/20/vol1
> Brick78: cpu02:/bricks/20/vol1
> Brick79: cpu03:/bricks/20/vol1
> Brick80: cpu04:/bricks/20/vol1
> Brick81: cpu01:/bricks/21/vol1
> Brick82: cpu02:/bricks/21/vol1
> Brick83: cpu03:/bricks/21/vol1
> Brick84: cpu04:/bricks/21/vol1
> Brick85: cpu01:/bricks/22/vol1
> Brick86: cpu02:/bricks/22/vol1
> Brick87: cpu03:/bricks/22/vol1
> Brick88: cpu04:/bricks/22/vol1
> Brick89: cpu01:/bricks/23/vol1
> Brick90: cpu02:/bricks/23/vol1
> Brick91: cpu03:/bricks/23/vol1
> Brick92: cpu04:/bricks/23/vol1
> Brick93: cpu01:/bricks/24/vol1
> Brick94: cpu02:/bricks/24/vol1
> Brick95: cpu03:/bricks/24/vol1
> Brick96: cpu04:/bricks/24/vol1
> Options Reconfigured:
> nfs.disable: off
> user.cifs: enable
> auth.allow: 10.10.0.*
> performance.quick-read: off
> performance.read-ahead: off
> performance.io-cache: off
> performance.stat-prefetch: off
> cluster.eager-lock: enable
> network.remote-dio: enable
> cluster.quorum-type: auto
> cluster.server-quorum-type: server
> storage.owner-uid: 36
> storage.owner-gid: 36
> server.allow-insecure: on
> [root@cpu01 ~]#
>
> Thanks,
> punit
>
> On Tue, Feb 17, 2015 at 6:16 AM, Ben Turner <bturner@xxxxxxxxxx> wrote:
>
> ----- Original Message -----
> > From: "Joe Julian" <joe@xxxxxxxxxxxxxxxx>
> > To: "Punit Dambiwal" <hypunit@xxxxxxxxx>, gluster-users@xxxxxxxxxxx, "Humble Devassy Chirammal" <humble.devassy@xxxxxxxxx>
> > Sent: Monday, February 16, 2015 3:32:31 PM
> > Subject: Re: Gluster performance on the small files
> >
> > On 02/12/2015 10:58 PM, Punit Dambiwal wrote:
> > >
> > > Hi,
> > >
> > > I have seen that gluster performance is dead slow on small files... even though I am using SSDs... the performance is too bad... I am getting better performance on my SAN with normal SATA disks...
> > >
> > > I am using distributed replicated glusterfs with replica count=2... I have all SSD disks on the bricks...
> > >
> > > root@vm3:~# dd bs=64k count=4k if=/dev/zero of=test oflag=dsync
> > > 4096+0 records in
> > > 4096+0 records out
> > > 268435456 bytes (268 MB) copied, 57.3145 s, 4.7 MB/s
>
> This seems pretty slow, even if you are using gigabit. Here is what I get:
>
> [root@gqac031 smallfile]# dd bs=64k count=4k if=/dev/zero of=/gluster-emptyvol/test oflag=dsync
> 4096+0 records in
> 4096+0 records out
> 268435456 bytes (268 MB) copied, 10.5965 s, 25.3 MB/s
>
> FYI this is on my 2 node pure replica + spinning disks (RAID 6; this is not set up for smallfile workloads, for smallfile I normally use RAID 10) + 10G.
>
> The single threaded dd process is definitely a bottleneck here; the power in distributed systems is in doing things in parallel across clients / threads. You may want to try smallfile:
>
> http://www.gluster.org/community/documentation/index.php/Performance_Testing
>
> Smallfile command used:
> python /small-files/smallfile/smallfile_cli.py --operation create --threads 8 --file-size 64 --files 10000 --top /gluster-emptyvol/ --pause 1000 --host-set "client1, client2"
>
> total threads = 16
> total files = 157100
> total data = 9.589 GB
> 98.19% of requested files processed, minimum is 70.00
> 41.271602 sec elapsed time
> 3806.491454 files/sec
> 3806.491454 IOPS
> 237.905716 MB/sec
>
> If you wanted to do something similar with dd you could do:
>
> <my script>
> for i in $(seq 1 4)
> do
>     dd bs=64k count=4k if=/dev/zero of=/gluster-emptyvol/test$i oflag=dsync &
> done
> for pid in $(pidof dd); do
>     while kill -0 "$pid" 2>/dev/null; do
>         sleep 0.1
>     done
> done
>
> # time myscript.sh
>
> Then do the math to figure out the MB / sec of the system.
>
> -b
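A minimal sketch of that "run several dd's and do the math" idea, in bash. The /gluster-emptyvol path and the four writers are only placeholders carried over from the example above, and it assumes GNU date (for %N) and bc are available:

MOUNT=/gluster-emptyvol      # placeholder: point this at your gluster mount
WRITERS=4                    # number of parallel dd processes

start=$(date +%s.%N)
for i in $(seq 1 $WRITERS)
do
    # each writer pushes 64k x 4096 = 256 MiB, syncing every block
    dd bs=64k count=4k if=/dev/zero of=$MOUNT/test$i oflag=dsync &
done
wait                         # block until every background dd has finished
end=$(date +%s.%N)

total_mb=$((WRITERS * 256))
elapsed=$(echo "$end - $start" | bc)
echo "wrote ${total_mb} MiB in ${elapsed} s"
echo "aggregate: $(echo "scale=1; $total_mb / $elapsed" | bc) MiB/s"

Each writer will still look slow on its own because of the per-block sync; the number that matters for the volume is the aggregate across writers (and, better still, across client machines, as the smallfile run does).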
The "small file" issue has > to do with the > > overhead of finding and checking the validity of any file, > but with a small > > file the percentage of time doing those checks is > proportionally greater. > > With your VM image, that file is already open. There are no > self-heal checks > > or lookups that are happening in your tests, so that > overhead is not the > > problem. > > > > _______________________________________________ > > Gluster-users mailing list > > Gluster-users@xxxxxxxxxxx > > http://www.gluster.org/mailman/listinfo/gluster-users > > > > _______________________________________________ > Gluster-users mailing list > Gluster-users@xxxxxxxxxxx > http://www.gluster.org/mailman/listinfo/gluster-users _______________________________________________ Gluster-users mailing list Gluster-users@xxxxxxxxxxx http://www.gluster.org/mailman/listinfo/gluster-users