Re: Fwd: High Available Transparent File System


 



  1. What is fencing?
    Fencing is the act of forcefully removing a node from a cluster. A node with OCFS2 mounted will fence itself when it realizes that it does not have quorum in a degraded cluster. It does this so that other nodes won't get stuck trying to access its resources. Currently OCFS2 will panic the machine when it realizes it has to fence itself off from the cluster; it does this when it sees more nodes heartbeating than it has connectivity to and therefore fails the quorum test.
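    To see the pieces that feed into this decision on a running cluster, the commands below are a rough sketch (assuming the standard ocfs2-tools package and the default O2CB init script; exact paths may differ per distribution). They show whether the O2CB stack is online and heartbeating, which nodes are configured in the cluster, and which nodes currently have each OCFS2 volume mounted:
    	# /etc/init.d/o2cb status
    	# cat /etc/ocfs2/cluster.conf
    	# mounted.ocfs2 -f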
    Due to user reports of nodes hanging during fencing, OCFS2 1.2.5 no longer uses "panic" for fencing. Instead, by default, it uses "machine restart". This should not only prevent nodes from hanging during fencing but also allow nodes to quickly restart and rejoin the cluster. While this change is internal in nature, we are documenting it so users are aware that they will no longer see the familiar panic stack trace during fencing. Instead they will see the message "*** ocfs2 is very sorry to be fencing this system by restarting ***", and even that will probably appear only among the messages captured on the netdump/netconsole server.
    If the user wishes to use panic to fence (perhaps to see the familiar oops stack trace, or on the advice of customer support to diagnose frequent reboots), they can do so by issuing the following command after the O2CB cluster is online.
    	# echo 1 > /proc/fs/ocfs2_nodemanager/fence_method
    
    Please note that this change is local to a node.
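    Because the setting is local to a node and, being a /proc setting, presumably does not survive a reboot, a node that should always fence by panic needs the command re-applied after every boot. One possible way to do that (a sketch, not an official recommendation; it assumes /etc/rc.local runs after the O2CB cluster is brought online) is:
    	if [ -w /proc/fs/ocfs2_nodemanager/fence_method ]; then
    	    # 1 = fence by panic, as in the command above; the 1.2.5+ default is machine restart
    	    echo 1 > /proc/fs/ocfs2_nodemanager/fence_method
    	fi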


At 2011-04-10 22:29:01,"Meisam Mohammadkhani" <meisam.mohammadkhani@xxxxxxxxx> wrote:
Hi All,

I'm new to GFS. I'm looking for a solution for our enterprise application, which is responsible for saving (and manipulating) historical data from industrial devices. Currently we have two stations that work as hot-redundant peers of each other. Our challenge is handling failures. For now, our application itself is responsible for fault handling, by synchronizing the files that changed during the fault. It runs on two totally independent machines (one as the redundant), so each one has its own disk.
We are looking for something like a "highly available, transparent file system" that makes the fault transparent to the application, so that in case of a fault the redundant machine can still access the files even when the master machine is down (through replication or something similar).
Is there a fail-over feature in GFS that satisfies our requirement? In other words, can GFS help us in our case?

Regards



--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
