Re: CLVM/GFS2 distributed locking

When I pull the cables between the shared storage and foo01, foo01 gets fenced. Here is some info from foo02 about the shared storage, plus the dlm debug output (the lock file seems to remain locked):

root@foo02:-//data/activemq_data# ls -li
total 276
 66467 -rw-r--r-- 1 root root 33030144 Dec 30 16:32 db-1.log
 66468 -rw-r--r-- 1 root root    73728 Dec 30 16:24 db.data
 66470 -rw-r--r-- 1 root root    53344 Dec 30 16:24 db.redo
128014 -rw-r--r-- 1 root root        0 Dec 30 19:49 dummy
 66466 -rw-r--r-- 1 root root        0 Dec 30 16:23 lock
root@foo02:-//data/activemq_data# grep -A 7 -i 103a2 /debug/dlm/activemq
Resource ffff81090faf96c0 Name (len=24) "       2           103a2" 
Master Copy
Granted Queue
03d10002 PR Remote:   1 00c80001
00e00001 PR
Conversion Queue
Waiting Queue
--
Resource ffff81090faf97c0 Name (len=24) "       5           103a2" 
Master Copy
Granted Queue
03c30003 PR Remote:   1 039a0001
03550001 PR
Conversion Queue
Waiting Queue
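
A note on reading the output above: 103a2 is the inode number of the lock file (66466) written in hex, which is why the grep is on that string. Below is a minimal Python sketch that splits such a dump into resources and queue entries. It rests on assumptions not confirmed in this thread: that the 24-character resource name is an 8-character hex lock type followed by a 16-character hex lock number, that the type numbers map to the GFS2 glock types listed in GLOCK_TYPES, and that an entry without "Remote:" is held by the local node. The names parse_resource_name, parse_dlm_debug and SAMPLE are made up for illustration.

```python
import re

# Assumed GFS2 glock type numbers (not confirmed anywhere in this thread).
GLOCK_TYPES = {1: "nondisk", 2: "inode", 3: "rgrp", 4: "meta", 5: "iopen",
               6: "flock", 7: "plock", 8: "quota", 9: "journal"}

RESOURCE_RE = re.compile(r'^Resource \S+ Name \(len=\d+\) "(.*)"')
LOCK_RE = re.compile(
    r'^([0-9a-f]{8}) (\w+)(?: Remote:\s*(\d+) ([0-9a-f]{8}))?$')

SAMPLE = '''\
Resource ffff81090faf96c0 Name (len=24) "       2           103a2"
Master Copy
Granted Queue
03d10002 PR Remote:   1 00c80001
00e00001 PR
Conversion Queue
Waiting Queue
--
Resource ffff81090faf97c0 Name (len=24) "       5           103a2"
Master Copy
Granted Queue
03c30003 PR Remote:   1 039a0001
03550001 PR
Conversion Queue
Waiting Queue
'''


def parse_resource_name(name):
    """Split a resource name into (lock type, lock number), both hex."""
    return int(name[:8].strip(), 16), int(name[8:].strip(), 16)


def parse_dlm_debug(text):
    """Return one dict per resource, with its three lock queues."""
    resources, current, queue = [], None, None
    for raw in text.splitlines():
        line = raw.rstrip()
        m = RESOURCE_RE.match(line)
        if m:
            lock_type, number = parse_resource_name(m.group(1))
            current = {"type": GLOCK_TYPES.get(lock_type, str(lock_type)),
                       "number": number,
                       "granted": [], "conversion": [], "waiting": []}
            resources.append(current)
            queue = None
            continue
        if current is None:
            continue
        if line.endswith(" Queue"):
            queue = line.split()[0].lower()  # granted / conversion / waiting
            continue
        m = LOCK_RE.match(line)
        if m and queue:
            current[queue].append({
                "lock_id": m.group(1),
                "mode": m.group(2),
                # None is assumed to mean "held by the local node"
                "node": int(m.group(3)) if m.group(3) else None,
            })
    return resources


if __name__ == "__main__":
    for res in parse_dlm_debug(SAMPLE):
        print(res["type"], res["number"], res["granted"])
```

Read this way, both resources refer to inode 66466, and each has two granted PR locks: one from node ID 1 and one with no remote node, while the conversion and waiting queues are empty. Which host has node ID 1 is not shown in the output.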


Are there any docs on how to interpret this dlm debug output?


Regards,
Stevo.

On Fri, Dec 30, 2011 at 9:23 PM, Digimer <linux@xxxxxxxxxxx> wrote:
On 12/30/2011 03:08 PM, Stevo Slavić wrote:
> Hi Digimer and Yvette,
>
> Thanks for tips! I don't doubt reliability of the technology, just want
> to make sure it is configured well.
>
> After fencing a node that held a lock on a file on shared storage, lock
> remains, and non-fenced node cannot take over the lock on that file.
> Wondering how can one check which process (from which node if possible)
> is holding a lock on a file on shared storage.
> dlm should have taken care of releasing the lock once node got fenced,
> right?
>
> Regards,
> Stevo.

After a successful fence call, DLM will clean up any locks held by the
lost node. That's why it's so critical that the fence action actually
succeeds (i.e., test-test-test). If a node doesn't actually die in a
fence, but the cluster thinks it did, and the lost node somehow returns,
the lost node will think its locks are still valid and modify shared
storage, leading to near-certain data corruption.

It's all perfectly safe, provided you've tested your fencing properly. :)

Yvette,

 You might be right on the 'noatime' implying 'nodiratime'... I add
both out of habit.

--
Digimer
E-Mail:              digimer@xxxxxxxxxxx
Freenode handle:     digimer
Papers and Projects: http://alteeve.com
Node Assassin:       http://nodeassassin.org
"omg my singularity battery is dead again.
stupid hawking radiation." - epitron

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
