Re: Clustering Tutorial

Thanks.  I should have mentioned that we're doing high-performance clustering, not HA.  We have a Beowulf cluster (old and decrepit) and an OSCAR cluster.  None of our current clusters run RH, but that will probably change once we get our next cluster with 4 Opteron CPUs per box...

Yeehaw!

And a Big Thanks to everyone who responded.  I now have some good resources.  A lot of reading... yaaaawn !  heh-heh.

dave

On 10/20/05, Tim Spaulding <tspauld98@xxxxxxxxx> wrote:
Just a note of caution: there's a big difference between High Availability Clustering and High
Performance Clustering.  AFAIK, Beowulf is an HPC technology.  RHCS (Red Hat Cluster Suite) and
GFS (Global File System) are HAC technologies.  Some of the underlying building blocks are used by
both communities, but they are used for fundamentally different purposes.

http://www.linux-ha.org is the home of another Linux-based HAC technology.  They have more
documentation on clustering and its concepts.  Red Hat does a good job on the HOW-TOs of getting a
cluster working but a terrible job of telling folks the WHY-TOs of clustering.

I'm currently working on a comparison of linux-ha and RHCS, so if you have questions regarding HAC
on Linux then fire away.  If you have a beowulf cluster, I don't understand those, sorry.

--tims

--- Michael Will <mwill@xxxxxxxxxxxxxxxxxxxx> wrote:

> http://www.phy.duke.edu/resources/computing/brahma/Resources/beowulf_book.php
> is a good start.
> http://www.beowulf.org is another good place; it is also the home of the
> original beowulf mailing list.
>
> Generally I would recommend digging through recent mailing-list postings,
> because there are often very informed answers to questions.
>
> Lon just answered a fencing question a few days ago:
>
> "STONITH, STOMITH, etc. are indeed implementations of I/O fencing.
>
> Fencing is the act of forcefully preventing a node from being able to
> access resources after that node has been evicted from the cluster in an
> attempt to avoid corruption.
>
> The canonical example of when it is needed is the live-hang scenario, as
> you described:
>
> 1. node A hangs with I/Os pending to a shared file system
> 2. node B and node C decide that node A is dead and recover resources
> allocated on node A (including the shared file system)
> 3. node A resumes normal operation
> 4. node A completes I/Os to shared file system
>
> At this point, the shared file system is probably corrupt.  If you're
> lucky, fsck will fix it -- if you're not, you'll need to restore from
> backup.  I/O fencing (STONITH, or whatever we want to call it) prevents
> the last step (step 4) from happening.
>
> How fencing is done (power cycling via external switch, SCSI
> reservations, FC zoning, integrated methods like IPMI, iLO, manual
> intervention, etc.) is unimportant - so long as whatever method is used
> can guarantee that step 4 can not complete."
>
> "GFS can use fabric-level fencing - that is, you can tell the iSCSI
> server to cut a node off, or ask the fiber-channel switch to disable a
> port.  This is in addition to "power-cycle" fencing."
>
>
> Michael
>
> --
>
> Linux-cluster@xxxxxxxxxx
> https://www.redhat.com/mailman/listinfo/linux-cluster
>
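Lon's four-step live-hang scenario above can be sketched as a toy simulation. This is purely illustrative Python: `SharedDisk`, `fence()`, and `write()` are invented names standing in for real mechanisms (SCSI reservations, FC zoning, a power switch), not any actual cluster API. The point is only that once a node is on the fence list, its stale step-4 I/O can no longer reach the shared storage:

```python
class SharedDisk:
    """Shared storage that honors a fence list (hypothetical stand-in for
    SCSI reservations, FC zoning, or an iSCSI server cutting a node off)."""
    def __init__(self):
        self.blocks = {}
        self.fenced = set()

    def write(self, node, key, value):
        # A fenced node's I/O must be rejected at the storage layer.
        if node in self.fenced:
            raise PermissionError(f"node {node} is fenced; write rejected")
        self.blocks[key] = value

    def fence(self, node):
        self.fenced.add(node)

disk = SharedDisk()

# 1. node A hangs with an I/O pending to the shared file system
pending_io = ("journal", "stale data from node A")

# 2. nodes B and C evict A, fence it, then recover its resources
disk.fence("A")
disk.write("B", "journal", "recovered by node B")

# 3./4. node A resumes and tries to complete its stale I/O -- blocked
try:
    disk.write("A", *pending_io)
except PermissionError:
    print("stale write blocked")

print(disk.blocks["journal"])  # the recovered data survives intact
```

Without the `disk.fence("A")` call, node A's delayed write would silently overwrite the recovered journal, which is exactly the corruption the quoted explanation warns about.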







