As someone else pointed out, it is possible to run diskless
workstations with their root on the GFS.  I haven't tried this
configuration, so I don't know what issues there may be.  There is a
security issue, though: since the workstations all run from the same
disk, a compromise of one can corrupt the entire cluster.

On my systems, I just have a small hard drive to hold the OS and
applications and then mount the GFS as a data partition.  (There's a
rough sketch of that setup at the bottom of this mail.)

Bowie

Greg Perry wrote:
> Also, after reviewing the GFS architecture it seems there would be
> significant security issues to consider, i.e. if one client/member
> of the GFS volume were compromised, that would lead to a full
> compromise of the filesystem across all nodes (and the ability to
> create special devices and modify the filesystem on any other GFS
> node member).  Are there any plans to include any form of
> discretionary or mandatory access controls for GFS in the upcoming
> v2 release?
>
> Greg
>
> Greg Perry wrote:
> > Thanks Bowie, I understand more now.  So within this architecture,
> > it would make more sense to utilize a RAID-5/10 SAN, then add
> > diskless workstations as needed for performance...?
> >
> > For said diskless workstations, does it make sense to run
> > Stateless Linux to keep the images the same across all of the
> > workstations/client machines?
> >
> > Regards
> >
> > Greg
> >
> > Bowie Bailey wrote:
> > > Greg Perry wrote:
> > > > I have been researching GFS for a few days, and I have some
> > > > questions that hopefully some seasoned users of GFS may be
> > > > able to answer.
> > > >
> > > > I am working on the design of a Linux cluster that needs to be
> > > > scalable.  It will be primarily an RDBMS-driven data warehouse
> > > > used for data mining and content indexing.  In an ideal world,
> > > > we would be able to start with a small (say, 4-node) cluster,
> > > > then add machines (and storage) as the various RDBMSes grow in
> > > > size, as well as use virtual IPs for load balancing across
> > > > multiple lighttpd instances.  All machines in the cluster need
> > > > to be able to talk to the same volume of information, and GFS
> > > > (in theory at least) would be used to aggregate the drives
> > > > from each machine into one huge shared logical volume.  With
> > > > that being said, here are some questions:
> > > >
> > > > 1) What is the preference on the RDBMS?  Will MySQL 5.x work,
> > > > and are there any locking issues to consider?  What would the
> > > > best open source RDBMS be (MySQL vs. PostgreSQL, etc.)?
> > >
> > > Someone more qualified than me will have to answer that
> > > question.
> > >
> > > > 2) If there was a 10-machine cluster, each with a 300GB SATA
> > > > drive, can you use GFS to aggregate all 10 drives into one big
> > > > logical 3000GB volume?  Would that scenario work similarly to
> > > > a RAID array?  If one or two nodes fail, but the GFS quorum is
> > > > maintained, can those nodes be replaced and repopulated just
> > > > like a RAID-5 array?  If this scenario is possible, how
> > > > difficult is it to "grow" the shared logical volume by adding
> > > > additional nodes (say I had two more machines, each with a
> > > > 300GB SATA drive)?
> > >
> > > GFS doesn't work that way.  GFS is just a fancy filesystem.  It
> > > takes an already shared volume and allows all of the nodes to
> > > access it at the same time.
> > >
> > > > 3) How stable is GFS currently, and is it used in many
> > > > production environments?
> > >
> > > It seems to be stable for me, but we are still in testing mode
> > > at the moment.
> > >
> > > > 4) How stable is the FC5 version, and does it include all of
> > > > the configuration utilities in the RH Enterprise Cluster
> > > > version?  (The idea would be to prove the point on FC5, then
> > > > migrate to RH Enterprise.)
> > >
> > > Haven't used that one.
> > >
> > > > 5) Would CentOS be preferred over FC5 for the initial proof of
> > > > concept and early adoption?
> > >
> > > If your eventual platform is RHEL, then CentOS would make more
> > > sense for a testing platform since it is almost identical to
> > > RHEL.  Fedora can be less stable and may introduce some issues
> > > that you wouldn't have with RHEL.  On the other hand, RHEL may
> > > have some problems that don't appear on Fedora because of
> > > updated packages.
> > >
> > > If you want bleeding edge, use Fedora.
> > > If you want stability, use CentOS or RHEL.
> > >
> > > > 6) Are there any restrictions or performance advantages of
> > > > using all drives with the same geometry, or can you mix and
> > > > match different size drives and just add to the aggregate
> > > > volume size?
> > >
> > > As I said earlier, GFS does not do the aggregation.
> > >
> > > What you get with GFS is the ability to share an already
> > > networked storage volume.  You can use iSCSI, AoE, GNBD, or
> > > others to connect the storage to all of the cluster nodes.  Then
> > > you format the volume with GFS so that it can be used with all
> > > of the nodes.
> > >
> > > I believe there is a project for the aggregate filesystem that
> > > you are looking for, but as far as I know, it is still beta.
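
To follow up on the formatting point in my reply quoted above: once
the shared device is visible on every node (over iSCSI, AoE, GNBD, or
whatever you pick), the GFS part itself is just a mkfs and a mount.
Roughly like this, from memory and untested here -- "mycluster", the
journal count, the device path, and the mount point are all
placeholders you would substitute (the -t value has to match the
cluster name in your cluster.conf, and you want at least one journal
per node that will mount the filesystem):

  # run once, from any one node, against the shared device
  gfs_mkfs -p lock_dlm -t mycluster:data01 -j 10 /dev/vg01/lvol0

  # then on every node
  mkdir -p /data
  mount -t gfs /dev/vg01/lvol0 /data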
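
And the "small local drive for the OS, GFS as a data partition" setup
I mentioned at the top is nothing special either.  The OS and
applications install on the local disk as usual, and each node just
gets an fstab entry along these lines (again, the device and mount
point are placeholders; the cluster services -- ccsd/cman/fencing, and
clvmd if you use it -- have to be up before the mount will succeed):

  /dev/vg01/lvol0   /data   gfs   defaults,noatime   0 0

Bowie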