----- Original Message -----
> From: "Joe Julian" <joe@xxxxxxxxxxxxxxxx>
> To: "Punit Dambiwal" <hypunit@xxxxxxxxx>, gluster-users@xxxxxxxxxxx, "Humble Devassy Chirammal" <humble.devassy@xxxxxxxxx>
> Sent: Monday, February 16, 2015 3:32:31 PM
> Subject: Re: Gluster performance on the small files
>
> On 02/12/2015 10:58 PM, Punit Dambiwal wrote:
> >
> > Hi,
> >
> > I have seen that Gluster performance is dead slow on small files, even
> > though I am using SSDs... the performance is too bad. I even get better
> > performance from my SAN with normal SATA disks.
> >
> > I am using distributed-replicated GlusterFS with replica count = 2, and
> > all bricks are on SSD disks.
> >
> > root@vm3:~# dd bs=64k count=4k if=/dev/zero of=test oflag=dsync
> > 4096+0 records in
> > 4096+0 records out
> > 268435456 bytes (268 MB) copied, 57.3145 s, 4.7 MB/s

This seems pretty slow, even if you are using gigabit. Here is what I get:

[root@gqac031 smallfile]# dd bs=64k count=4k if=/dev/zero of=/gluster-emptyvol/test oflag=dsync
4096+0 records in
4096+0 records out
268435456 bytes (268 MB) copied, 10.5965 s, 25.3 MB/s

FYI, this is on my 2-node pure replica with spinning disks (RAID 6 -- this box is not set up for smallfile workloads; for smallfile I normally use RAID 10) and 10G networking. The single-threaded dd process is definitely a bottleneck here; the power of distributed systems is in doing things in parallel across clients and threads.
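To put those dd numbers in perspective: with oflag=dsync, dd flushes every 64k block synchronously, so the run is really 4096 round trips and the elapsed time is dominated by per-write latency. A quick back-of-the-envelope check, using the figures from the slow dd output above:

```shell
# implied latency per synchronous 64k write: elapsed seconds / number of records
awk 'BEGIN { printf "%.1f ms per 64k dsync write\n", 57.3145 / 4096 * 1000 }'
# prints: 14.0 ms per 64k dsync write
```

Roughly 14 ms per synchronous write is what you would expect from several network round trips plus a flush, which is why a single dsync stream looks so slow regardless of how fast the SSDs are.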
You may want to try smallfile:
http://www.gluster.org/community/documentation/index.php/Performance_Testing

Smallfile command used:

python /small-files/smallfile/smallfile_cli.py --operation create --threads 8 --file-size 64 --files 10000 --top /gluster-emptyvol/ --pause 1000 --host-set "client1, client2"

total threads = 16
total files = 157100
total data = 9.589 GB
98.19% of requested files processed, minimum is 70.00
41.271602 sec elapsed time
3806.491454 files/sec
3806.491454 IOPS
237.905716 MB/sec

If you wanted to do something similar with dd you could do:

<my script>
for i in $(seq 1 4); do
    dd bs=64k count=4k if=/dev/zero of=/gluster-emptyvol/test$i oflag=dsync &
done
for pid in $(pidof dd); do
    while kill -0 "$pid" 2>/dev/null; do
        sleep 0.1
    done
done

# time myscript.sh

Then do the math to figure out the MB/sec of the system.

-b

> > root@vm3:~# dd bs=64k count=4k if=/dev/zero of=test conv=fdatasync
> > 4096+0 records in
> > 4096+0 records out
> > 268435456 bytes (268 MB) copied, 1.80093 s, 149 MB/s
>
> How small is your VM image? The image is the file that GlusterFS is serving,
> not the small files within it. Perhaps the filesystem you're using within
> your VM is inefficient with regard to how it handles disk writes.
>
> I believe your concept of "small file" performance is misunderstood, as is
> often the case with this phrase. The "small file" issue has to do with the
> overhead of finding and checking the validity of any file; with a small
> file, the percentage of time spent doing those checks is proportionally
> greater. With your VM image, that file is already open. There are no
> self-heal checks or lookups happening in your tests, so that overhead is
> not the problem.
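Coming back to the parallel-dd suggestion earlier in this message: the "do the math" step is just total bytes written divided by wall-clock time. A minimal sketch, where the 12.4 s elapsed time is purely a hypothetical value read off `time myscript.sh`:

```shell
# aggregate throughput for 4 parallel dd streams of 268435456 bytes each;
# ELAPSED is a hypothetical wall time taken from `time myscript.sh`
TOTAL_BYTES=$((4 * 268435456))
ELAPSED=12.4
awk -v b="$TOTAL_BYTES" -v t="$ELAPSED" 'BEGIN { printf "%.1f MB/s aggregate\n", b / t / 1000000 }'
# prints: 86.6 MB/s aggregate
```

The point is that the per-stream number will still look modest, but the aggregate across streams is what reflects what the cluster can actually sustain.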
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users@xxxxxxxxxxx
> http://www.gluster.org/mailman/listinfo/gluster-users
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users