Hi Wendy,

I created a bugzilla for this:
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=228916

Thanks,

Mark

> > Is there a way to resolve the deadlock without rebooting a server ?
> >
> I don't think there is a way to work around this hang other than a reboot.
> Will look into this with Dave (who owns the DLM code in the group).
>
> On Wednesday 14 February 2007 20:41, Wendy Cheng wrote:
> >> Mark Hlawatschek wrote:
> >>> On Tuesday 13 February 2007 20:56, Wendy Cheng wrote:
> >>>> Wendy Cheng wrote:
> >>>>> So it is removing a file. It has obtained the directory lock and is
> >>>>> waiting for the file lock. It looks to me like the DLM (LM_CB_ASYNC)
> >>>>> callback never occurs. Do you have any abnormal messages in your
> >>>>> /var/log/messages file ? Dave, how do we dump the locks from the DLM
> >>>>> side to see how DLM is thinking ?
> >>>>
> >>>> shell> cman_tool services                /* find your lock space name */
> >>>> shell> echo "lock-space-name-found-above" > /proc/cluster/dlm_locks
> >>>> shell> cat /proc/cluster/dlm_locks
> >>>>
> >>>> Then try to find the lock (2, hex of (1114343)) - cut and paste the
> >>>> contents of that file here.
> >>>
> >>> syslog seems to be ok.
> >>> Note that process 5856 is running on node1.
> >>
> >> What I was looking for is a lock (type=2, lock number=0x1100e7
> >> (=1114343)) - that's the lock that hangs process id=5856. Since pid=5856
> >> also holds another directory exclusive lock, nobody can access that
> >> directory.
> >>
> >> Apparently, from the GFS end, node 2 thinks 0x1100e7 is "unlocked" and
> >> node 1 is waiting for it. The only thing that can get node 1 out of this
> >> wait state is DLM's callback. If DLM doesn't have any record of this
> >> lock, pid=5856 will wait forever. Are you sure this is the whole file of
> >> the DLM output ? This lock somehow disappeared from DLM and I have no
> >> idea why we got into this state. If the files are too large, could you
> >> tar them and email them over ? I would like to see complete glock and
> >> dlm lock dumps on both nodes (4 files here). If possible, add the
> >> following two outputs (so 6 files total):
> >>
> >> shell> cd /tmp                  /* on both nodes */
> >> shell> script                   /* this should generate a file called typescript in the /tmp directory */
> >> shell> crash
> >> crash> foreach bt               /* keep hitting the space bar until this command runs through */
> >> crash> quit
> >> shell> <cntl><d>                /* this should close out the typescript file */
> >> shell> mv typescript nodex_crash   /* x=1, 2 based on node1 or node2 */
> >>
> >> Tar these 6 files (glock_dump_1, glock_dump_2, dlm_dump_1, dlm_dump_2,
> >> node1_crash, node2_crash) and email them to wcheng@xxxxxxxxxx
> >>
> >> Thank you for the help.
> >>
> >> -- Wendy
> >>
> >>> Here's the dlm output:
> >>>
> >>> node1:
> >>> Resource 0000010001218088 (parent 0000000000000000). Name (len=24) "       2          1100e7"
> >>> Local Copy, Master is node 2
> >>> Granted Queue
> >>> Conversion Queue
> >>> Waiting Queue
> >>> 5eb00178 PR (EX) Master: 3eeb0117 LQ: 0,0x5
> >>> [...]
> >>> Resource 00000100f56f0618 (parent 0000000000000000). Name (len=24) "       5          1100e7"
> >>> Local Copy, Master is node 2
> >>> Granted Queue
> >>> 5bc20257 PR Master: 3d9703e0
> >>> Conversion Queue
> >>> Waiting Queue
> >>>
> >>> node2:
> >>> Resource 00000107e462c8c8 (parent 0000000000000000). Name (len=24) "       2          1100e7"
> >>> Master Copy
> >>> Granted Queue
> >>> 3eeb0117 PR Remote: 1 5eb00178
> >>> Conversion Queue
> >>> Waiting Queue
> >>> [...]
> >>> Resource 000001079f7e81d8 (parent 0000000000000000). Name (len=24) "       5          1100e7"
> >>> Master Copy
> >>> Granted Queue
> >>> 3d9703e0 PR Remote: 1 5bc20257
> >>> 3e500091 PR
> >>> Conversion Queue
> >>> Waiting Queue
> >>>
> >>> Thanks for your help,
> >>>
> >>> Mark
> >>>
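For anyone digging through a similar hang, this is roughly how the inode number
from the glock maps onto the DLM resource names quoted above. It is only a
sketch: "mygfs" stands for whatever lock space name cman_tool services reports,
and the dump file name is just an example.

shell> printf '%x\n' 1114343                        /* GFS inode number in hex -> 1100e7 */
shell> echo "mygfs" > /proc/cluster/dlm_locks       /* "mygfs" is a placeholder lock space name */
shell> cat /proc/cluster/dlm_locks > /tmp/dlm_dump_node1
shell> grep -B 2 -A 6 "1100e7" /tmp/dlm_dump_node1  /* pick out the "2 1100e7" resource and its queues */

As far as I understand it, the leading "2" in the resource name is the glock
type (inode) and the second field is the inode number in hex, so Glock
(inode[2], 1114343) on the GFS side and the DLM resource "2 1100e7" above refer
to the same lock.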
> >>>>>>>> we have the following deadlock situation:
> >>>>>>>>
> >>>>>>>> A 2-node cluster consisting of node1 and node2.
> >>>>>>>> /usr/local is placed on a GFS filesystem mounted on both nodes.
> >>>>>>>> The lock manager is DLM.
> >>>>>>>> We are using RHEL4u4.
> >>>>>>>>
> >>>>>>>> An strace of ls -l /usr/local/swadmin/mnx/xml ends up hanging in
> >>>>>>>> lstat("/usr/local/swadmin/mnx/xml",
> >>>>>>>>
> >>>>>>>> This happens on both cluster nodes.
> >>>>>>>>
> >>>>>>>> All processes trying to access the directory
> >>>>>>>> /usr/local/swadmin/mnx/xml are in "Waiting for IO (D)" state,
> >>>>>>>> i.e. the system load is at about 400 ;-)
> >>>>>>>>
> >>>>>>>> Any ideas ?
> >>>>>>>
> >>>>>>> Quickly browsing this, it looks to me like the process with
> >>>>>>> pid=5856 got stuck. That process held the exclusive lock on the
> >>>>>>> file or directory (inode number 627732 - probably
> >>>>>>> /usr/local/swadmin/mnx/xml), so everyone was waiting for it.
> >>>>>>> The faulty process was apparently in the middle of obtaining
> >>>>>>> another exclusive lock (and almost got it). We need to know where
> >>>>>>> pid=5856 was stuck at that time. If this occurs again, could you
> >>>>>>> use "crash" to back trace that process and show us the output ?
> >>>>>>
> >>>>>> Here's the crash output:
> >>>>>>
> >>>>>> crash> bt 5856
> >>>>>> PID: 5856   TASK: 10bd26427f0   CPU: 0   COMMAND: "java"
> >>>>>>  #0 [10bd20cfbc8] schedule at ffffffff8030a1d1
> >>>>>>  #1 [10bd20cfca0] wait_for_completion at ffffffff8030a415
> >>>>>>  #2 [10bd20cfd20] glock_wait_internal at ffffffffa018574e
> >>>>>>  #3 [10bd20cfd60] gfs_glock_nq_m at ffffffffa01860ce
> >>>>>>  #4 [10bd20cfda0] gfs_unlink at ffffffffa019ce41
> >>>>>>  #5 [10bd20cfea0] vfs_unlink at ffffffff801889fa
> >>>>>>  #6 [10bd20cfed0] sys_unlink at ffffffff80188b19
> >>>>>>  #7 [10bd20cff30] filp_close at ffffffff80178e48
> >>>>>>  #8 [10bd20cff50] error_exit at ffffffff80110d91
> >>>>>>     RIP: 0000002a9593f649  RSP: 0000007fbfffbca0  RFLAGS: 00010206
> >>>>>>     RAX: 0000000000000057  RBX: ffffffff8011026a  RCX: 0000002a9cc9c870
> >>>>>>     RDX: 0000002ae5989000  RSI: 0000002a962fa3a8  RDI: 0000002ae5989000
> >>>>>>     RBP: 0000000000000000  R8:  0000002a9630abb0  R9:  0000000000000ffc
> >>>>>>     R10: 0000002a9630abc0  R11: 0000000000000206  R12: 0000000040115700
> >>>>>>     R13: 0000002ae23294b0  R14: 0000007fbfffc300  R15: 0000002ae5989000
> >>>>>>     ORIG_RAX: 0000000000000057  CS: 0033  SS: 002b
> >>>>>>
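A side note for anyone reproducing this: the decoded dumps quoted below
(lockdump.node1.dec / lockdump.node2.dec) come from raw GFS glock dumps taken
on each node. A minimal sketch of the collection step follows; the output file
names are only examples, and the arguments of the two helper scripts are
written from memory, so treat those two invocations as assumptions rather than
exact usage.

shell> gfs_tool lockdump /usr/local > /tmp/lockdump.node1                      /* raw glock dump of the GFS mount */
shell> decipher_lockstate_dump /tmp/lockdump.node1 > /tmp/lockdump.node1.dec   /* assumed usage of the helper script */
shell> parse_lockdump /tmp/lockdump.node1.dec /tmp/lockdump.node2.dec          /* assumed usage: chases waiter chains across both nodes */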
> >>>>>>>> a lockdump analysis with the decipher_lockstate_dump and
> >>>>>>>> parse_lockdump scripts shows the following output (the whole file
> >>>>>>>> is too large for the mailing-list):
> >>>>>>>>
> >>>>>>>> Entries: 101939
> >>>>>>>> Glocks:  60112
> >>>>>>>> PIDs:    751
> >>>>>>>>
> >>>>>>>> 4 chain:
> >>>>>>>> lockdump.node1.dec Glock (inode[2], 1114343)
> >>>>>>>>   gl_flags = lock[1]
> >>>>>>>>   gl_count = 5
> >>>>>>>>   gl_state = shared[3]
> >>>>>>>>   req_gh = yes
> >>>>>>>>   req_bh = yes
> >>>>>>>>   lvb_count = 0
> >>>>>>>>   object = yes
> >>>>>>>>   new_le = no
> >>>>>>>>   incore_le = no
> >>>>>>>>   reclaim = no
> >>>>>>>>   aspace = 1
> >>>>>>>>   ail_bufs = no
> >>>>>>>>   Request
> >>>>>>>>     owner = 5856
> >>>>>>>>     gh_state = exclusive[1]
> >>>>>>>>     gh_flags = try[0] local_excl[5] async[6]
> >>>>>>>>     error = 0
> >>>>>>>>     gh_iflags = promote[1]
> >>>>>>>>   Waiter3
> >>>>>>>>     owner = 5856
> >>>>>>>>     gh_state = exclusive[1]
> >>>>>>>>     gh_flags = try[0] local_excl[5] async[6]
> >>>>>>>>     error = 0
> >>>>>>>>     gh_iflags = promote[1]
> >>>>>>>>   Inode: busy
> >>>>>>>> lockdump.node2.dec Glock (inode[2], 1114343)
> >>>>>>>>   gl_flags =
> >>>>>>>>   gl_count = 2
> >>>>>>>>   gl_state = unlocked[0]
> >>>>>>>>   req_gh = no
> >>>>>>>>   req_bh = no
> >>>>>>>>   lvb_count = 0
> >>>>>>>>   object = yes
> >>>>>>>>   new_le = no
> >>>>>>>>   incore_le = no
> >>>>>>>>   reclaim = no
> >>>>>>>>   aspace = 0
> >>>>>>>>   ail_bufs = no
> >>>>>>>>   Inode:
> >>>>>>>>     num = 1114343/1114343
> >>>>>>>>     type = regular[1]
> >>>>>>>>     i_count = 1
> >>>>>>>>     i_flags =
> >>>>>>>>     vnode = yes
> >>>>>>>> lockdump.node1.dec Glock (inode[2], 627732)
> >>>>>>>>   gl_flags = dirty[5]
> >>>>>>>>   gl_count = 379
> >>>>>>>>   gl_state = exclusive[1]
> >>>>>>>>   req_gh = no
> >>>>>>>>   req_bh = no
> >>>>>>>>   lvb_count = 0
> >>>>>>>>   object = yes
> >>>>>>>>   new_le = no
> >>>>>>>>   incore_le = no
> >>>>>>>>   reclaim = no
> >>>>>>>>   aspace = 58
> >>>>>>>>   ail_bufs = no
> >>>>>>>>   Holder
> >>>>>>>>     owner = 5856
> >>>>>>>>     gh_state = exclusive[1]
> >>>>>>>>     gh_flags = try[0] local_excl[5] async[6]
> >>>>>>>>     error = 0
> >>>>>>>>     gh_iflags = promote[1] holder[6] first[7]
> >>>>>>>>   Waiter2
> >>>>>>>>     owner = none[-1]
> >>>>>>>>     gh_state = shared[3]
> >>>>>>>>     gh_flags = try[0]
> >>>>>>>>     error = 0
> >>>>>>>>     gh_iflags = demote[2] alloced[4] dealloc[5]
> >>>>>>>>   Waiter3
> >>>>>>>>     owner = 32753
> >>>>>>>>     gh_state = shared[3]
> >>>>>>>>     gh_flags = any[3]
> >>>>>>>>     error = 0
> >>>>>>>>     gh_iflags = promote[1]
> >>>>>>>>   [...loads of Waiter3 entries...]
> >>>>>>>>   Waiter3
> >>>>>>>>     owner = 4566
> >>>>>>>>     gh_state = shared[3]
> >>>>>>>>     gh_flags = any[3]
> >>>>>>>>     error = 0
> >>>>>>>>     gh_iflags = promote[1]
> >>>>>>>>   Inode: busy
> >>>>>>>> lockdump.node2.dec Glock (inode[2], 627732)
> >>>>>>>>   gl_flags = lock[1]
> >>>>>>>>   gl_count = 375
> >>>>>>>>   gl_state = unlocked[0]
> >>>>>>>>   req_gh = yes
> >>>>>>>>   req_bh = yes
> >>>>>>>>   lvb_count = 0
> >>>>>>>>   object = yes
> >>>>>>>>   new_le = no
> >>>>>>>>   incore_le = no
> >>>>>>>>   reclaim = no
> >>>>>>>>   aspace = 0
> >>>>>>>>   ail_bufs = no
> >>>>>>>>   Request
> >>>>>>>>     owner = 20187
> >>>>>>>>     gh_state = shared[3]
> >>>>>>>>     gh_flags = any[3]
> >>>>>>>>     error = 0
> >>>>>>>>     gh_iflags = promote[1]
> >>>>>>>>   Waiter3
> >>>>>>>>     owner = 20187
> >>>>>>>>     gh_state = shared[3]
> >>>>>>>>     gh_flags = any[3]
> >>>>>>>>     error = 0
> >>>>>>>>     gh_iflags = promote[1]
> >>>>>>>>   [...loads of Waiter3 entries...]
> >>>>>>>>   Waiter3
> >>>>>>>>     owner = 10460
> >>>>>>>>     gh_state = shared[3]
> >>>>>>>>     gh_flags = any[3]
> >>>>>>>>     error = 0
> >>>>>>>>     gh_iflags = promote[1]
> >>>>>>>>   Inode: busy
> >>>>>>>> 2 requests
> >>>>>>>

--
Gruss / Regards,

** Visit us at CeBIT 2007 in Hannover/Germany **
** in Hall 5, Booth G48/2 (15.-21. of March)  **

Dipl.-Ing. Mark Hlawatschek
http://www.atix.de/
http://www.open-sharedroot.org/

**
ATIX - Ges. fuer Informationstechnologie und Consulting mbH
Einsteinstr. 10 - 85716 Unterschleissheim - Germany

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster