On 02/14/2011 09:23 PM, Nikola Savic wrote:
>> I have an in-progress tutorial, which I would recommend as a guide
>> only. If you are interested, I will send you the link off-list.
>>
>> As for your question; no, you can read/write to the shared storage at
>> the same time without the need for iSCSI. DRBD can run in
>> "Primary/Primary[/Primary]" mode. Then you layer clustered LVM onto
>> this, followed by GFS2. Once up, all three nodes can access and edit
>> the same storage space at the same time.
>>
>> So you're taking advantage of all three technologies. As for mirrored
>> LVM, I've not tried it yet, as DRBD->cLVM->GFS2 has worked quite well
>> for me.
>
> I just read about the Primary/Primary configuration in DRBD's User
> Guide, but would love to get the link to the tutorial you mentioned,
> especially if it covers fencing :) When one of the servers is
> restarted and there is a delay in data being written to DRBD, what
> happens when the server is back up? Is booting stopped by DRBD until
> synchronization is done, or does it try to do it in the background? If
> it's done in the background, how does Primary/Primary mode work?
>
> Thanks,
> Nikola

Once the cluster manager (corosync in Cluster3, openais in Cluster2)
stops getting messages from a node (be it hung or dead), it starts a
counter. Once that counter exceeds a set threshold, the node is declared
dead and a fence is called against it. This should, when working
properly, reliably prevent the node from accessing the shared storage
(ie: stop it from trying to complete a write operation). Once, and
*only* if, the fence was successful, the cluster will reform. With the
new cluster configuration in place, recovery of the file system can
begin (ie: the journal can be replayed). Finally, normal operation can
continue, albeit with one less node. This is also where the resource
manager (rgmanager or pacemaker) starts shuffling around any resources
that were lost when the node went down.
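To make the ordering above concrete, here is a minimal Python sketch of
the decision logic (this is an illustration only, not the real corosync
or fenced code; the timeout value and function names are mine): a silent
node is tolerated until the counter passes the threshold, and recovery
only proceeds if the fence agent confirms success.

```python
TOKEN_TIMEOUT = 3  # illustrative threshold of missed messages


def handle_silent_node(missed_messages, fence_node):
    """Return the cluster's next action for a node that stopped talking.

    fence_node: a callable standing in for the fence agent; it must
    return True only when the fence is confirmed successful.
    """
    if missed_messages <= TOKEN_TIMEOUT:
        return "wait"     # node may just be slow; don't act yet
    if not fence_node():
        return "block"    # fence failed: hang all storage access
    return "recover"      # fence confirmed: reform, replay the journal


# A fence agent that reports success vs. one that reports failure:
print(handle_silent_node(5, lambda: True))   # recover
print(handle_silent_node(5, lambda: False))  # block
print(handle_silent_node(1, lambda: True))   # wait
```

The key point the sketch captures is that "block" is a deliberate
outcome: a failed fence never falls through to recovery.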
Traditionally, fencing involves rebooting the lost node, in the hope
that it will come back in a healthier state. Assuming it does come up
healthy, a couple of main steps must occur.

First, it will rejoin the other DRBD members. These members keep a
"dirty block" list in memory, which allows them to quickly bring the
recovered server back into sync. During this time you can bring that
node online (ie: set it Primary and start accessing it via GFS2), but
note that it cannot be the sole Primary device until it is fully synced.

Second, the cluster reforms to readmit the recovered node. Once the
member has successfully joined, the resource manager (again, rgmanager
or pacemaker) will begin reorganizing the clustered resources as per
your configuration.

An important note: If the fence call fails (either because of a fault in
the fence device or due to misconfiguration), the cluster will hang and
*all* access to the shared storage will stop. *This is by design!* The
reason is that, should the cluster falsely assume the node was dead,
begin recovering the journal, and then the hung node recovered and tried
to complete its write, the shared filesystem would be corrupted. That
is: "Better a hung cluster than a corrupt cluster." This is why fencing
is so critical. :)

-- 
Digimer
E-Mail: digimer@xxxxxxxxxxx
AN!Whitepapers: http://alteeve.com
Node Assassin: http://nodeassassin.org

-- 
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster