Hi,

I think you are up to an interesting project. Maybe you could share a few more details:

(1) What cloud are you planning to use: EC2 with EBS volumes, or a hosted offering like Rackspace?
(2) What are your motivations for using RAID10? On Amazon, for example, that would increase your monthly price from $10k to $20k just for storage, not counting I/O operations. (I am not suggesting you use RAID0, btw.)
(3) Is this for something like a web farm, with one Unix user accessing the web farm, or is it a multi-user, HPC-like environment for which you need a POSIX file system?

So far the discussion has focused on XFS vs ZFS. I admit that I am a fan of ZFS, and I have only used XFS for performance reasons on MySQL servers, where it did well. But when I read something like this

http://oss.sgi.com/archives/xfs/2011-08/msg00320.html

it makes me not want to use XFS for big data. You can assume that this is a real, recent bug, because Joe is a smart guy who knows exactly what he is doing. Joe and the Gluster guys are vendors who can work around these issues and provide support. If XFS is the choice, maybe you should hire them for this gig.

ZFS typically does not have these FS repair issues in the first place. The motivation of Lawrence Livermore for porting ZFS to Linux was quite clear:

http://zfsonlinux.org/docs/SC10_BoF_ZFS_on_Linux_for_Lustre.pdf

OK, they have 50PB and we are talking about much smaller deployments. However, I can confirm some of the limitations they report. Also, recovering from a drive failure with this whole LVM/Linux RAID stuff is unpredictable. Hot swapping does not always work, and if you raise the priority of the re-sync of data to the new drive you can strangle the entire box (by default the priority of the re-sync process is low on Linux).
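To make the re-sync priority point a bit more concrete: Linux md exposes its rebuild throttle as two tunables, speed_limit_min and speed_limit_max, under /proc/sys/dev/raid (values in KB/s per device). The sketch below only reads them and estimates a best-case rebuild time. The function names and the 8 TB drive size (taken from the setup described in the quoted message) are illustrative assumptions, and the estimate ignores competing I/O, so treat the numbers as a lower bound rather than a prediction.

```python
from pathlib import Path

# Standard location of the md re-sync tunables (values in KB/s per device).
MD_SYSCTL_DIR = Path("/proc/sys/dev/raid")


def read_resync_limits(base=MD_SYSCTL_DIR):
    """Return (min_kbps, max_kbps), or None if the md tunables are absent."""
    try:
        low = int((Path(base) / "speed_limit_min").read_text().strip())
        high = int((Path(base) / "speed_limit_max").read_text().strip())
    except (OSError, ValueError):
        return None
    return low, high


def resync_hours(device_bytes, kbps):
    """Best-case hours to rebuild one device at a sustained rate in KB/s."""
    return device_bytes / (kbps * 1024) / 3600


if __name__ == "__main__":
    limits = read_resync_limits()
    if limits is None:
        print("no md tunables found on this machine")
    else:
        low, high = limits
        disk = 8 * 10**12  # assumption: one 8 TB drive, as in the quoted setup
        print(f"floor   {low} KB/s -> {resync_hours(disk, low):.1f} h")
        print(f"ceiling {high} KB/s -> {resync_hours(disk, high):.1f} h")
```

Raising speed_limit_min is the knob that makes the rebuild compete with production I/O, which is where the "strangle the entire box" effect comes from.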
If you are a Linux expert you can handle this kind of stuff (or hire someone), but if you ever want to hand this setup to a storage administrator you had better give them something they can use with confidence (this may be less of an issue in the cloud). Compare this to ZFS: re-silvering works with a very predictable result and timing. There is a ton of info out there on this topic. I think that Gluster users may be getting around many of the Linux RAID issues by simply taking the entire node down (which is OK in mirrored node settings) or by using hardware RAID controllers (which are often not available in the cloud).

Some in the Linux community seem to be slightly opposed to ZFS (I assume because of the licensing issue) and sometimes make odd suggestions ("You should use BTRFS").

As someone who is involved with managing hundreds of terabytes of storage, I can say that if something goes wrong with a big hunk of your storage it is quite a pain to get it back. I would only feel comfortable using a combination of Gluster with anything as my primary storage if I had it mirrored to another datacenter, and not using Gluster technology for the mirroring (hence my RAID10 question, or maybe that is what you are planning). Primary storage of that size without mirroring I would put on a commercial product like Isilon, IBM or BlueArc, where I get 24x7 support, etc.

We are currently happy users of GlusterFS. We use it as a caching tier for HPC (our users have managed to bring it down) and for backup, and I love it for that. We are currently testing ZFSOnLinux with Gluster 3.2.3 on good hardware (8 cores, 64 GB, SSD for caching) with ultra cheap drives (WD Caviar Green), and the performance results are very impressive. I am currently not too concerned with stability. Should the kernel crash (which has not happened yet), the data will be unaffected, because no Linux code is actually touching any of the hard drives (that goes through the Solaris porting layer).
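Coming back to the RAID10 question, the arithmetic can be sketched roughly as follows. The server and disk counts come from the quoted message, and the per-GB rate is not a quoted price: it is simply what the $10k per month figure implies for 100 TB of storage. The function names are made up for illustration, and I/O charges are ignored.

```python
def usable_tb(servers, disks_per_server, disk_tb, replicas):
    """Usable capacity in TB when every file is stored `replicas` times."""
    raw_tb = servers * disks_per_server * disk_tb
    return raw_tb / replicas


def monthly_storage_usd(data_tb, replicas, usd_per_gb_month):
    """Monthly storage bill: every replica is billed as provisioned space."""
    provisioned_gb = data_tb * 1000 * replicas
    return provisioned_gb * usd_per_gb_month


# Assumption: rate derived from "$10k per month for 100 TB", not a price list.
IMPLIED_RATE = 10_000 / (100 * 1000)

if __name__ == "__main__":
    # 16 servers, 2 x 8 TB each, replicated once (distribute + replicate).
    print("usable TB:", usable_tb(16, 2, 8, replicas=2))
    print("unmirrored:", round(monthly_storage_usd(100, 1, IMPLIED_RATE)))
    print("mirrored:  ", round(monthly_storage_usd(100, 2, IMPLIED_RATE)))
```

With two copies of everything, the 256 TB of raw disk in the proposed layout comes down to 128 TB usable, and the storage bill for a given amount of data doubles.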
dipe

On Sat, Sep 24, 2011 at 5:10 AM, RDP <rdp.com at gmail.com> wrote:
> Hello,
> Maybe this question has been addressed elsewhere, but I'd like
> the opinion and experience of other users.
> There could be some misconceptions that I might be carrying, so please be
> kind enough to point them out. Any help, advice and suggestions will be very
> highly appreciated.
> My goal is to get a greater than 100 TB gluster NAS up on the cloud. Each
> server will hold around 2x8TB disks. The export volume size (client disk
> mount size) would be greater than 20 TB.
> This is how I am planning to set it all up: 16 servers, each with 2x8=16 TB
> of space. The glusterfs will be replicated and distributed (raid-10). I'd
> like to go with ZFS on linux for the disks.
> The client machines will use the glusterfs client for mounting the volumes.
> ext4 is limited to 16 TB due to userspace tool (e2fsprogs).
> Would this be considered a production ready setup? The data housed on
> this cluster is critical and hence I need to be very sure before I go
> ahead with this kind of a setup.
> Or would using ZFS with Gluster make more sense on FreeBSD or illumos
> (ZFS is native there)?
> Thanks a lot