Mark Hlawatschek wrote:
On Tuesday 13 February 2007 20:56, Wendy Cheng wrote:
Wendy Cheng wrote:
So it is removing a file. It has obtained the directory lock and is
waiting for the file lock. It looks to me like the DLM (LM_CB_ASYNC)
callback never occurred. Do you have any abnormal messages in your
/var/log/messages file? Dave, how do we dump the locks from the DLM
side to see what DLM is thinking?
shell> cman_tool services /* find your lock space name */
shell> echo "lock-space-name-found-above" > /proc/cluster/dlm_locks
shell> cat /proc/cluster/dlm_locks
Then try to find the lock (type 2, hex of 1114343) and cut-and-paste
that section of the file here.
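A quick way to get that hex value, assuming a standard POSIX shell with
printf:

```shell
# Convert the decimal inode number to the hex form that appears in the
# DLM resource name (e.g. " 2 1100e7" = lock type 2, inode 0x1100e7).
printf '%x\n' 1114343
```

This prints 1100e7, which is the string to look for in the
/proc/cluster/dlm_locks output.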
Syslog seems to be OK.
Note that process 5856 is running on node1.
What I was looking for is a lock (type=2 and lock number 0x1100e7
(=1114343)) - that's the lock that hangs process id=5856. Since
pid=5856 also holds an exclusive lock on another directory, nobody can
access that directory.
Apparently, from the GFS end, node 2 thinks 0x1100e7 is "unlocked" and
node 1 is waiting for it. The only thing that can get node 1 out of
this wait state is DLM's callback. If DLM doesn't have any record of
this lock, pid=5856 will wait forever. Are you sure this is the
complete DLM output? This lock somehow disappeared from DLM, and I have
no idea how we got into this state. If the files are too large, could
you tar them up and email them over? I would like to see complete glock
and DLM lock dumps from both nodes (4 files). If possible, add the
following two outputs as well (6 files total):
shell> cd /tmp            /* on both nodes */
shell> script             /* this generates a file called typescript in the /tmp directory */
shell> crash
crash> foreach bt         /* keep hitting the space bar until this command runs through */
crash> quit
shell> <cntl><d>          /* this closes out the typescript file */
shell> mv typescript nodex_crash   /* x=1 or 2, for node1 or node2 */
Tar these 6 files (glock_dump_1, glock_dump_2, dlm_dump_1, dlm_dump_2,
node1_crash, node2_crash) and email them to wcheng@xxxxxxxxxx
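One way to bundle them (gfs_debug.tar.gz is just a name I picked, not a
required one):

```shell
# Bundle the six debug files into a single compressed tarball for email.
tar czf gfs_debug.tar.gz glock_dump_1 glock_dump_2 dlm_dump_1 dlm_dump_2 \
    node1_crash node2_crash
```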
Thanks for your help.
-- Wendy
Here's the dlm output:
node1:
Resource 0000010001218088 (parent 0000000000000000). Name (len=24) " 2 1100e7"
Local Copy, Master is node 2
Granted Queue
Conversion Queue
Waiting Queue
5eb00178 PR (EX) Master: 3eeb0117 LQ: 0,0x5
[...]
Resource 00000100f56f0618 (parent 0000000000000000). Name (len=24) " 5 1100e7"
Local Copy, Master is node 2
Granted Queue
5bc20257 PR Master: 3d9703e0
Conversion Queue
Waiting Queue
node2:
Resource 00000107e462c8c8 (parent 0000000000000000). Name (len=24) " 2 1100e7"
Master Copy
Granted Queue
3eeb0117 PR Remote: 1 5eb00178
Conversion Queue
Waiting Queue
[...]
Resource 000001079f7e81d8 (parent 0000000000000000). Name (len=24) " 5 1100e7"
Master Copy
Granted Queue
3d9703e0 PR Remote: 1 5bc20257
3e500091 PR
Conversion Queue
Waiting Queue
Thanks for your help,
Mark
We have the following deadlock situation:
A 2-node cluster consisting of node1 and node2.
/usr/local is placed on a GFS filesystem mounted on both nodes.
The lock manager is DLM.
We are using RHEL4u4.
An strace of "ls -l /usr/local/swadmin/mnx/xml" ends up hanging in
lstat("/usr/local/swadmin/mnx/xml",
This happens on both cluster nodes.
All processes trying to access the directory
/usr/local/swadmin/mnx/xml
are in "Waiting for IO (D)" state, i.e. the system load is at
about 400 ;-)
Any ideas?
Quickly browsing this, it looks to me like the process with pid=5856
got stuck. That process held the exclusive lock on the file or
directory (inode number 627732 - probably /usr/local/swadmin/mnx/xml),
so everyone else was waiting for it. The faulty process was apparently
in the middle of obtaining another exclusive lock (and had almost got
it). We need to know where pid=5856 was stuck at that time. If this
occurs again, could you use "crash" to back-trace that process and show
us the output?
Here's the crash output:
crash> bt 5856
PID: 5856 TASK: 10bd26427f0 CPU: 0 COMMAND: "java"
#0 [10bd20cfbc8] schedule at ffffffff8030a1d1
#1 [10bd20cfca0] wait_for_completion at ffffffff8030a415
#2 [10bd20cfd20] glock_wait_internal at ffffffffa018574e
#3 [10bd20cfd60] gfs_glock_nq_m at ffffffffa01860ce
#4 [10bd20cfda0] gfs_unlink at ffffffffa019ce41
#5 [10bd20cfea0] vfs_unlink at ffffffff801889fa
#6 [10bd20cfed0] sys_unlink at ffffffff80188b19
#7 [10bd20cff30] filp_close at ffffffff80178e48
#8 [10bd20cff50] error_exit at ffffffff80110d91
RIP: 0000002a9593f649 RSP: 0000007fbfffbca0 RFLAGS: 00010206
RAX: 0000000000000057 RBX: ffffffff8011026a RCX: 0000002a9cc9c870
RDX: 0000002ae5989000 RSI: 0000002a962fa3a8 RDI: 0000002ae5989000
RBP: 0000000000000000 R8: 0000002a9630abb0 R9: 0000000000000ffc
R10: 0000002a9630abc0 R11: 0000000000000206 R12: 0000000040115700
R13: 0000002ae23294b0 R14: 0000007fbfffc300 R15: 0000002ae5989000
ORIG_RAX: 0000000000000057 CS: 0033 SS: 002b
A lockdump analysis with decipher_lockstate_dump and parse_lockdump
shows the following output (the whole file is too large for the
mailing list):
Entries: 101939
Glocks: 60112
PIDs: 751
4 chain:
lockdump.node1.dec Glock (inode[2], 1114343)
gl_flags = lock[1]
gl_count = 5
gl_state = shared[3]
req_gh = yes
req_bh = yes
lvb_count = 0
object = yes
new_le = no
incore_le = no
reclaim = no
aspace = 1
ail_bufs = no
Request
owner = 5856
gh_state = exclusive[1]
gh_flags = try[0] local_excl[5] async[6]
error = 0
gh_iflags = promote[1]
Waiter3
owner = 5856
gh_state = exclusive[1]
gh_flags = try[0] local_excl[5] async[6]
error = 0
gh_iflags = promote[1]
Inode: busy
lockdump.node2.dec Glock (inode[2], 1114343)
gl_flags =
gl_count = 2
gl_state = unlocked[0]
req_gh = no
req_bh = no
lvb_count = 0
object = yes
new_le = no
incore_le = no
reclaim = no
aspace = 0
ail_bufs = no
Inode:
num = 1114343/1114343
type = regular[1]
i_count = 1
i_flags =
vnode = yes
lockdump.node1.dec Glock (inode[2], 627732)
gl_flags = dirty[5]
gl_count = 379
gl_state = exclusive[1]
req_gh = no
req_bh = no
lvb_count = 0
object = yes
new_le = no
incore_le = no
reclaim = no
aspace = 58
ail_bufs = no
Holder
owner = 5856
gh_state = exclusive[1]
gh_flags = try[0] local_excl[5] async[6]
error = 0
gh_iflags = promote[1] holder[6] first[7]
Waiter2
owner = none[-1]
gh_state = shared[3]
gh_flags = try[0]
error = 0
gh_iflags = demote[2] alloced[4] dealloc[5]
Waiter3
owner = 32753
gh_state = shared[3]
gh_flags = any[3]
error = 0
gh_iflags = promote[1]
[...loads of Waiter3 entries...]
Waiter3
owner = 4566
gh_state = shared[3]
gh_flags = any[3]
error = 0
gh_iflags = promote[1]
Inode: busy
lockdump.node2.dec Glock (inode[2], 627732)
gl_flags = lock[1]
gl_count = 375
gl_state = unlocked[0]
req_gh = yes
req_bh = yes
lvb_count = 0
object = yes
new_le = no
incore_le = no
reclaim = no
aspace = 0
ail_bufs = no
Request
owner = 20187
gh_state = shared[3]
gh_flags = any[3]
error = 0
gh_iflags = promote[1]
Waiter3
owner = 20187
gh_state = shared[3]
gh_flags = any[3]
error = 0
gh_iflags = promote[1]
[...loads of Waiter3 entries...]
Waiter3
owner = 10460
gh_state = shared[3]
gh_flags = any[3]
error = 0
gh_iflags = promote[1]
Inode: busy
2 requests
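For anyone wanting to pull just the two views of one glock out of
decoded dumps this size, a plain grep is enough - a sketch, assuming
the decoded files are named lockdump.node1.dec and lockdump.node2.dec
as above:

```shell
# Show each node's entry for the glock of inode 1114343, with a few
# context lines so the gl_state fields (shared on node1 vs unlocked on
# node2) can be compared side by side.
grep -A 3 'Glock (inode\[2\], 1114343)' lockdump.node1.dec lockdump.node2.dec
```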
--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster