On 04/20/2011 02:29 PM, Mohit Anchlia wrote:
> Please find
>
> [root@dsdb1 ~]# cat /proc/sys/vm/drop_caches
> 3
> [root@dsdb1 ~]# dd if=/dev/zero of=/data/big.file bs=128k count=80k oflag=direct
> 81920+0 records in
> 81920+0 records out
> 10737418240 bytes (11 GB) copied, 521.553 seconds, 20.6 MB/s

Suddenly this makes a great deal more sense.

> [root@dsdb1 ~]#
> [root@dsdb1 ~]# dd if=/dev/zero of=/data/big.file bs=128k count=80k iflag=direct
> dd: opening `/dev/zero': Invalid argument
> [root@dsdb1 ~]# dd of=/dev/null if=/data/big.file bs=128k iflag=direct
> 81920+0 records in
> 81920+0 records out
> 10737418240 bytes (11 GB) copied, 37.854 seconds, 284 MB/s
> [root@dsdb1 ~]#

About what I expected.

Ok. Uncached OS writes get you to 20 MB/s, which is about what you are seeing with the fuse mount and a dd. So I think we understand the write side. The read side is about where I expected (lower, actually, but not by enough to concern me).

You can try changing to bs=2M count=6k on both to see the effect of larger blocks. You should get some improvement.

I think we need to dig into the details of that RAID0 construction now. This might be something better done off-list (unless everyone wants to see the gory details of digging into the hardware side). My current thought is that this is a hardware issue, not a gluster issue per se, but that there are possibilities for improving performance on the gluster side of the equation.

Short version: the PERC is not fast (it never has been), and it is often a bad choice for high performance. You are often better off building an MD RAID using the software tools in Linux; it will be faster (a rough sketch is at the end of this note). Think of the PERC as an HBA with some modicum of built-in RAID capability. You don't really want to use that capability if possible, but you do want to use the HBA.

Longer version: this is likely a striping issue or a caching issue (we need to see battery state, cache size, etc.), not to mention the slow chip. Are the disk write caches off or on? (I'm guessing off, which is the right thing to do for some workloads, but it does impact performance.) Also, the RAID CPU in the PERC (it's a rebadged LSI) is very low performance in general, and specifically not terribly good even at RAID0.

These are direct writes, skipping the OS cache. They let you see how fast the underlying hardware is, and whether it can handle the amount of data you want to shove onto the disks.

Here is my desktop:

root@metal:/local2/home/landman# dd if=/dev/zero of=/local2/big.file bs=128k count=80k oflag=direct
81920+0 records in
81920+0 records out
10737418240 bytes (11 GB) copied, 64.7407 s, 166 MB/s

root@metal:/local2/home/landman# dd if=/dev/zero of=/local2/big.file bs=2M count=6k oflag=direct
6144+0 records in
6144+0 records out
12884901888 bytes (13 GB) copied, 86.0184 s, 150 MB/s

and a server in the lab:

[root@jr5-1 ~]# dd if=/dev/zero of=/data/big.file bs=128k count=80k oflag=direct
81920+0 records in
81920+0 records out
10737418240 bytes (11 GB) copied, 11.0948 seconds, 968 MB/s

[root@jr5-1 ~]# dd if=/dev/zero of=/data/big.file bs=2M count=6k oflag=direct
6144+0 records in
6144+0 records out
12884901888 bytes (13 GB) copied, 5.11935 seconds, 2.5 GB/s

Gluster will not be faster than the bare metal (silicon). It may hide some of the issues with caching, but it is bounded by how fast you can push bits to, or pull bits from, the media.

In an "optimal" config, the 4x SAS 10k RPM drives should be able to sustain ~600 MB/s of writes. Reality will be less than this; I'd guess 250-400 MB/s in most cases. This is still pretty low in performance.
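As a rough sketch of the software RAID route mentioned above (not a recipe for your exact box): I'm assuming the four SAS drives show up to the OS as /dev/sdb through /dev/sde, that they hold nothing you care about, and that /data is where the gluster brick lives. Device names, the 256k chunk, and the choice of xfs are all guesses to adjust for your setup.

# check whether the on-disk write caches are enabled (SAS drives; WCE: 1 = on)
sdparm --get=WCE /dev/sdb /dev/sdc /dev/sdd /dev/sde

# build a 4-drive software RAID0 (chunk size is a guess, tune for your workload)
mdadm --create /dev/md0 --level=0 --raid-devices=4 --chunk=256 \
    /dev/sdb /dev/sdc /dev/sdd /dev/sde

# put a filesystem on it and mount it where the brick lives
mkfs.xfs /dev/md0
mount /dev/md0 /data

# repeat the direct I/O tests against the MD device
dd if=/dev/zero of=/data/big.file bs=2M count=6k oflag=direct
dd of=/dev/null if=/data/big.file bs=2M iflag=direct

If the PERC will not pass the disks through individually, the usual workaround is to build a single-drive RAID0 virtual disk per physical drive and stripe those with MD.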
-- 
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics Inc.
email: landman@scalableinformatics.com
web  : http://scalableinformatics.com
       http://scalableinformatics.com/sicluster
phone: +1 734 786 8423 x121
fax  : +1 866 888 3112
cell : +1 734 612 4615