Re: CLVMD without GFS

brem belguebli <brem.belguebli@xxxxxxxxx> · Tue, 21 Jul 2009 13:16:19 +0200

Hi Chrissie,

Indeed, a VG is created as clustered by default, so a newly created LV is activated on all nodes.

To avoid data corruption, I re-created the VG as non-clustered (vgcreate -c n vgXX), then created the LV, which was activated only on the node where it was created.

I then changed the VG to clustered (vgchange -c y VGXX) and activated it exclusively on this node.

But I could still reproduce the behaviour of bypassing the exclusive flag:

On node B, change the VG back to non-clustered even though it is activated exclusively on node A.

Then activate it on node B, and it works.

The point I'm trying to make is that simply by clearing the clustered flag you can bypass the exclusive activation.
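
For reference, the sequence above can be written out as a dry-run script. This is only a sketch: vg10, lvol1 and /dev/mpath/mpath0 are the names from the original post further down the thread, the 1G LV size is an assumption, and the run helper is hypothetical. By default it only prints each command; nothing touches LVM unless DRY_RUN=0 is set.

```sh
#!/bin/sh
# Dry-run sketch of the reproduction steps. Nothing is executed unless
# DRY_RUN=0 is set in the environment.
DRY_RUN="${DRY_RUN:-1}"
VG="${VG:-vg10}"                 # VG name from the original post
LV="${LV:-lvol1}"                # LV name from the original post
PV="${PV:-/dev/mpath/mpath0}"    # shared LUN from the original post
SIZE="${SIZE:-1G}"               # assumption: no size is given in the thread

# Hypothetical helper: print the command, and run it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

steps_node_a() {
    run vgcreate -c n "$VG" "$PV"            # create the VG non-clustered
    run lvcreate -L "$SIZE" -n "$LV" "$VG"   # LV becomes active on this node only
    run vgchange -c y "$VG"                  # mark the VG clustered
    run vgchange -a e "$VG"                  # activate exclusively on node A
}

steps_node_b() {
    run vgchange -c n "$VG"                  # clear the clustered flag from node B
    run vgchange -a y "$VG"                  # reported to succeed despite A's exclusive activation
}
```

Calling steps_node_a on node A and then steps_node_b on node B prints the commands in the order described.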

I think a barrier is necessary to prevent this from happening: removing the clustered flag from a VG should be possible only if the node holding the VG exclusively is down (does the lock manager, DLM, report which node holds a VG exclusively?)

Thanks

Brem

2009/7/21, Christine Caulfield <ccaulfie@xxxxxxxxxx>:
Hiya,

I've just tried this on my cluster and it works fine.

What you need to remember is that lvcreate on one node will also activate the LV on all nodes in the cluster - it does an implicit lvchange -ay when you create it. What I can't explain is why vgchange -ae seemed to work fine on node A; it should give the same error as on node B because the LVs are open shared on both nodes.

It's not clear to me when you tagged the VG as clustered, so that might be contributing to the problem. When I create a new VG on shared storage it automatically gets labelled clustered, so I have never needed to do this explicitly. If you create a non-clustered VG you probably ought to deactivate it on all nodes first, as it could mess up the locking otherwise. This *might* be the cause of your troubles.

The error on clvmd startup can be ignored. It's caused by clvmd using a background command with --no_locking so that it can check which LVs (if any) are already active and re-acquire locks for them.

Sorry this isn't conclusive; the exact order in which things are happening is not clear to me.

Chrissie. 

On 07/21/2009 10:21 AM, brem belguebli wrote:

Hi all,
I think there is something to clarify about using CLVM across a cluster
in an active/passive mode without GFS.
From my understanding, CLVM keeps LVM metadata coherent among the
cluster nodes and provides a cluster-wide locking mechanism that can
prevent any node from trying to activate a volume group if it has been
activated exclusively (vgchange -a e VGXXX) by another node (which
needs to be up).

I have been playing with it to check this behaviour, but it doesn't seem
to do what is expected.
I have 2 nodes (RHEL 5.3 x86_64, cluster installed and configured), A
and B, using shared SAN storage.
I have a LUN from this SAN seen by both nodes. I pvcreate'd
/dev/mpath/mpath0, vgcreate'd vg10 and lvcreate'd lvol1 (on one node),
and created an ext3 FS on /dev/vg10/lvol1.
CLVM is running in debug mode (clvmd -d2), but it complains about
locking being disabled even though locking is set to 3 on both nodes.

On node A:
          vgchange -c y vg10 returns OK (vgs -->  vg10     1   1   0 wz--nc)
          vgchange -a e --> OK
          lvs returns lvol1   vg10   -wi-a-
On node B (while things are active on A, A is UP and member of the cluster):
          vgchange -a e --> Error locking on node B: Volume is busy on another node
                            1 logical volume(s) in volume group "vg10" now active
It activates vg10 even if it sees it busy on another node.

On B, lvs returns lvol1   vg10   -wi-a-
as well as on A.
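
The attribute strings in the output above can be checked from a script. In the vgs Attr column the sixth character is "c" for a clustered VG, and in the lvs Attr column the fifth character is "a" for an active LV. A minimal sketch, with hypothetical function names:

```sh
#!/bin/sh
# vg_attr: the 6th character is "c" when the VG is clustered.
vg_is_clustered() {
    case "$1" in
        ?????c*) return 0 ;;
        *)       return 1 ;;
    esac
}

# lv_attr: the 5th character is "a" when the LV is active.
lv_is_active() {
    case "$1" in
        ????a*) return 0 ;;
        *)      return 1 ;;
    esac
}
```

For example, vg_is_clustered "$(vgs --noheadings -o vg_attr vg10 | tr -d ' ')" succeeds for the wz--nc string shown above, and lv_is_active "$(lvs --noheadings -o lv_attr vg10/lvol1 | tr -d ' ')" succeeds for -wi-a-.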
I think the main problem comes from the warning printed when starting
CLVM in debug mode: "WARNING: Locking disabled. Be careful! This could
corrupt your metadata."

IMHO, the algorithm should be as follows:
VG is tagged as clustered (vgchange -c y VGXXX)
if a node (node P) tries to activate the VG exclusively (vgchange -a e VGXXX)
    ask the lock manager to check if VG is not already locked by another
    node (node X)
    if so, check if node X is up
        if node X is down, return OK to node P
        else
        return NOK to node P (explicitly stating that VG is held exclusively by node X)
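
The decision logic proposed above could be sketched as a pure function. This is not how clvmd behaves; it only restates the proposal. The holder and its up/down state are passed in as arguments because the thread does not say how they would be obtained from the lock manager. Returning OK when no node holds the lock, or when the requester already holds it, is an assumption the proposal does not spell out.

```sh
#!/bin/sh
# Decide whether node P may activate a clustered VG exclusively.
#   $1 = requesting node (P)
#   $2 = node currently holding the VG exclusively (X), or "" if none
#   $3 = state of the holder: "up" or "down"
# Prints OK or NOK and returns 0 for OK, 1 for NOK.
may_activate_exclusive() {
    requester="$1"
    holder="$2"
    holder_state="$3"

    # Assumption: no holder, or the requester is the holder -> OK.
    if [ -z "$holder" ] || [ "$holder" = "$requester" ]; then
        echo "OK"
        return 0
    fi

    # Holder is another node: allow only if that node is down.
    if [ "$holder_state" = "down" ]; then
        echo "OK"
        return 0
    fi

    echo "NOK: VG is held exclusively by node $holder"
    return 1
}
```

With node A up and holding the VG, may_activate_exclusive B A up prints the NOK message and returns 1; with A down it prints OK.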
Brem
PS: this shouldn't be a problem with GFS or other clustered FS (OCFS,
etc...) as no node should try to activate exclusively any VG.

------------------------------------------------------------------------

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx

https://www.redhat.com/mailman/listinfo/linux-cluster
