Re: performance due to network?


 



Ah! I love the smell of disk benchmarks in the morning. :)

Well, just kidding, but this is one of those questions. I myself was also "pretty sure" I had more than enough I/O, just because the label on the disk said 15K RPM and 3 Gb/s, with brand-name hardware and reputable controllers.

Turns out my "nice" setup performed like (well, I leave it to you to fill in the gaps).

Start from scratch. Test your disk setup with dd.

Try to match your controller's stripe size with your filesystem's block size AND dd's block size when using it.

This is a bit of alchemy, so try not to kill me over any mix-ups, but it is all worth a shot:

For instance, prepare your RAID volume with a 512 KB stripe size. In theory, if you will deal with lots of small files, make this number smaller, and vice versa.
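
If you happen to be on Linux software RAID rather than a hardware controller, the same idea would look roughly like this with mdadm (the device names are placeholders, and --chunk is given in KiB, so 512 means a 512 KB stripe unit):

# 12-disk RAID5 with a 512 KB chunk; /dev/md0 and /dev/sd[b-m] are placeholders
mdadm --create /dev/md0 --level=5 --raid-devices=12 --chunk=512 /dev/sd[b-m]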

Next, don't forget to prepare your filesystem as XFS with -i size=512. Not really performance-related, but it is in the manual and it matters for Gluster.
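
For example, something along these lines (the device name is a placeholder, and su/sw only make sense if they match your real RAID layout - here the 512 KB stripe from above and a 12-disk RAID5, i.e. 11 data-bearing disks):

# -i size=512 is the inode size the Gluster docs ask for
# su = stripe unit, sw = number of data-bearing disks; /dev/sdb is a placeholder
mkfs.xfs -i size=512 -d su=512k,sw=11 /dev/sdb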

Next, the DD black magic:

dd if=/dev/zero of=/your_volume_mount_point/output_file.txt bs=512k count=SOME_NUMBER

(double-check the block-size suffix: with GNU dd, bs=512k means 512 × 1024 bytes; see your dd man page if in doubt)

Now, here is how to decide what SOME_NUMBER should be:

a) dd will write SOME_NUMBER blocks of 512 KB of zeroes to your file. Thus, if you use count=2, it will make a 1 MB file.

b) You want the output file to be a little bigger than the amount of RAM in your server, to avoid caching in RAM. Normally this also avoids caching in the controller's memory. If you have 8 GB of RAM, you will need something like 17000 (17000 × 512 KiB ≈ 8.9 GB, a little more than the 8 GB of RAM).

If you ask me why not just use bs=1M, I can only point back to the "black magic" part and explain that I am matching the 512s, using 512 everywhere.

So, after a few minutes, dd should give you a summary of the operation (depending on the version you have on your distro); I recommend also using

time dd ........

in order to have an idea of the time taken by the operation.
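
Putting it together for the 8 GB RAM example above (the mount point and file name are the same placeholders as before; conv=fdatasync is optional, but it makes dd flush the data to disk before reporting, which keeps the number honest):

time dd if=/dev/zero of=/your_volume_mount_point/output_file.txt bs=512k count=17000 conv=fdatasync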

This is a very poor way of testing performance, I know, but it is a very good start.

Also, take your time and test the server itself, to see if it can handle that sort of throughput at all, again with dd:

dd if=/dev/zero of=/dev/null bs=512k count=whatever_you_want_but_go_big

It will give you a rough idea of how fast your system can push data through memory, of sorts, and whether it is able to cope with your theoretical 10 Gb network. Mine isn't; my disk controllers will not go over 4 Gb/s, no matter the disk setup.
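
To compare dd's numbers with the network, remember that dd reports bytes per second while links are sold in bits per second, so multiply by 8. Purely as an example with made-up numbers:

500 MB/s x 8 = 4000 Mb/s, about 4 Gb/s (less than half of a 10 Gb link)
and a 10 Gb link tops out around 10 / 8 = 1.25 GB/s before any protocol overhead.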

If you happen to find out that your disks are performing well, then repeat this for NFS mounts and for native FUSE/GlusterFS mounts.
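
A minimal sketch of that round, assuming a server called server1 and a volume called myvol (both placeholders); the NFS mount uses vers=3 because Gluster's built-in NFS server speaks NFSv3:

mkdir -p /mnt/gluster /mnt/nfs

# native FUSE/GlusterFS client
mount -t glusterfs server1:/myvol /mnt/gluster
time dd if=/dev/zero of=/mnt/gluster/output_file.txt bs=512k count=17000 conv=fdatasync

# Gluster's built-in NFS
mount -t nfs -o vers=3 server1:/myvol /mnt/nfs
time dd if=/dev/zero of=/mnt/nfs/output_file.txt bs=512k count=17000 conv=fdatasync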

You might want to test your network before that, true.
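
If you do, iperf (or iperf3) between a client and one of the gluster servers is the usual quick check; the hostname is a placeholder:

# on one of the servers
iperf3 -s

# on the client, run for 30 seconds
iperf3 -c server1 -t 30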

Tip: make sure your network is "uncluttered" (switches and NICs) and set to jumbo frames (MTU 9000). When configuring my back end, I set up the switch (dedicated to storage) as if I were using iSCSI over it, and disabled all the phone/VoIP crap (oh, there it goes, I finally said it) and other extra features.
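
A couple of commands that may help there (eth0 stands for your storage NIC; the ping works because a 9000-byte MTU minus 28 bytes of IP/ICMP headers leaves 8972 bytes of payload):

# set jumbo frames on the NIC (not persistent across reboots)
ip link set eth0 mtu 9000

# verify the whole path passes 9000-byte frames without fragmenting
ping -M do -s 8972 some_other_host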

Now, sharing a bit more of the "black magic": I recently learned that higher temperatures on network cables can affect performance. I don't expect that to be an issue on an SFP+ style cable like the one you are probably using, but after reading that I went to my rack, only to find out that all of my cables were sitting right behind the bloody power supplies of the servers, and they were very hot and soft.

Guess what: 8 hours of downtime to redo all the network (regular 1 Gb) and power cabling, which is now all tidy and even has thermal insulation.

Was there any performance improvement? I cannot say, but at least it is something else to take into consideration.

Cheers.







On Thu, Jun 12, 2014 at 10:40 PM, Aronesty, Erik <earonesty@xxxxxxxxxxxxxxxxxxxxxx> wrote:

I suspect I'm having performance issues because of network speeds.

 

Supposedly I have 10 Gbit connections on all my NAS devices; however, it seems to me that the fastest I can write is 1 Gbit. When I'm copying very large files, etc., I see 'D' as the cp waits on I/O, but when I go to the gluster servers, I don't see glusterfsd waiting (D) to write to the bricks themselves. I have 4 nodes, each with a 10 Gbit connection; each has 2 Areca RAID controllers with a 12-disk RAID5, and the 2 controllers striped into 1 large volume. Pretty sure there's plenty of I/O left on the bricks themselves.

 

Is it possible that "one big file" isn't the right test… should I try 20 big files, and see how saturated my network can get?

 

Erik Aronesty
Senior Bioinformatics Architect

EA | Quintiles
Genomic Services

4820 Emperor Boulevard

Durham, NC 27703 USA


Office: + 919.287.4011
erik.aronesty@xxxxxxxxxxxxx

www.quintiles.com  
www.expressionanalysis.com

 


_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://supercolony.gluster.org/mailman/listinfo/gluster-users


