David,

Even with your dlm-astd-wake.patch my nodes still hung in umount.  I put
in code to spin doing write_trylock() in the dlm code, print a stack
trace, and break out if it spun too long.  Here's the stack trace from my
added dump_stack():

 [<c010423e>] dump_stack+0x1e/0x30
 [<f8aeb64b>] write_lock_dir+0x5b/0x70 [dlm]
 [<f8aeb6a8>] dlm_dir_remove+0x48/0x140 [dlm]
 [<f8afd712>] _release_rsb+0x162/0x2e0 [dlm]
 [<f8afd8a9>] release_rsb+0x19/0x20 [dlm]
 [<f8ae84c6>] process_asts+0xe6/0x200 [dlm]
 [<f8ae8f0b>] dlm_astd+0x1db/0x210 [dlm]
 [<c013325a>] kthread+0xba/0xc0
 [<c01013c5>] kernel_thread_helper+0x5/0x10

After this the code oopsed:

Unable to handle kernel paging request at virtual address 6b6b6b6b
 printing eip:
f8aeb58e
*pde = 00000000
Oops: 0000 [#1]
PREEMPT SMP
Modules linked in: lock_dlm dlm lock_nolock qla2200 qla2xxx gfs lock_harness cman dm_mod video
CPU:    1
EIP:    0060:[<f8aeb58e>]    Not tainted VLI
EFLAGS: 00010202   (2.6.11)
EIP is at search_bucket+0x1e/0x80 [dlm]
eax: 000009ec   ebx: 00000010   ecx: 6b6b6b6b   edx: cd2c4000
esi: c96717ad   edi: 0000007f   ebp: cd80fee4   esp: cd80fed0
ds: 007b   es: 007b   ss: 0068
Process dlm_astd (pid: 29107, threadinfo=cd80f000 task=f6d53040)
Stack: 6b6b6b6b e87520ac 00000010 c96717ad 0000007f cd80ff14 f8aeb6c2 e87520ac
       c96717ad 00000010 0000007f 00000000 00000001 e87520ac 00000001 c9671728
       e87520ac cd80ff40 f8afd712 e87520ac 00000001 c96717ad 00000010 f8af490a
Call Trace:
 [<c01041ff>] show_stack+0x7f/0xa0
 [<c01043b2>] show_registers+0x162/0x1e0
 [<c01045de>] die+0xfe/0x190
 [<c0115892>] do_page_fault+0x3b2/0x6f2
 [<c0103e57>] error_code+0x2b/0x30
 [<f8aeb6c2>] dlm_dir_remove+0x62/0x140 [dlm]
 [<f8afd712>] _release_rsb+0x162/0x2e0 [dlm]
 [<f8afd8a9>] release_rsb+0x19/0x20 [dlm]
 [<f8ae84c6>] process_asts+0xe6/0x200 [dlm]
 [<f8ae8f0b>] dlm_astd+0x1db/0x210 [dlm]
 [<c013325a>] kthread+0xba/0xc0
 [<c01013c5>] kernel_thread_helper+0x5/0x10

I have slab debug on, so accessing 0x6b6b6b6b means we are accessing
freed memory.

Looking at the code, the problem is a race condition between dlm_astd()
and release_lockspace().  dlm_astd() can pull an lkb off the ast_queue
and still be processing it while release_lockspace() is running;
release_lockspace() calls dlm_dir_clear() and then kfree()s
ls->ls_dirtbl.  When dlm_astd() then calls release_rsb(), it ends up in
dlm_dir_remove(), which accesses ls_dirtbl after it has been freed.
With slab debug, this leads to a spinning write_lock() and a hung
umount.  My machines are 2-CPU systems, which may also help expose the
race.

The fix is below and is fairly simple: do the astd_suspend() in
release_lockspace() before the dlm_dir_clear() and kfree().  That way
astd won't be processing an lkb on the ast_queue while the directory
table is being freed.  Here's the patch:

--- dlm-kernel/src/lockspace.c.orig	2005-03-24 14:37:28.000000000 -0800
+++ dlm-kernel/src/lockspace.c	2005-03-24 14:42:37.000000000 -0800
@@ -477,6 +477,14 @@ static int release_lockspace(struct dlm_
 	remove_lockspace(ls);
 
 	/*
+	 * Suspend astd before doing the dlm_dir_clear() and kfree(),
+	 * otherwise astd can be processing an ast which can call release_rsb()
+	 * and then dlm_dir_remove() which references ls_dirtbl after
+	 * it has been freed.
+	 */
+	astd_suspend();
+
+	/*
 	 * Free direntry structs.
 	 */
 
@@ -487,8 +495,6 @@ static int release_lockspace(struct dlm_
 	 * Free all lkb's on lkbtbl[] lists.
 	 */
 
-	astd_suspend();
-
 	for (i = 0; i < ls->ls_lkbtbl_size; i++) {
 		head = &ls->ls_lkbtbl[i].list;
 		while (!list_empty(head)) {

Thoughts?

Daniel
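
P.S.  For anyone following along who doesn't know these code paths, here
is a minimal userspace sketch of the ordering problem.  It is only an
analogy: worker/table/worker_stop and everything else in it are invented
for illustration, none of it is actual dlm code.

/* race_sketch.c: simplified analogue of the dlm_astd/release_lockspace race.
 * A worker thread keeps dereferencing a shared table (like dlm_astd
 * touching ls_dirtbl); teardown must stop the worker before freeing the
 * table, just as the patch moves astd_suspend() before the kfree(). */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

static int *table;               /* stands in for ls->ls_dirtbl */
static volatile int worker_stop; /* stands in for "suspend the worker" */

static void *worker(void *arg)
{
	/* Like dlm_astd: keeps walking the table until told to stop. */
	while (!worker_stop) {
		for (int i = 0; i < 16; i++)
			(void)table[i]; /* use-after-free if teardown freed it first */
	}
	return NULL;
}

int main(void)
{
	pthread_t t;

	table = calloc(16, sizeof(*table));
	pthread_create(&t, NULL, worker, NULL);

	/* Buggy order (old release_lockspace()): free the table first, stop
	 * the worker later -- the worker can still be reading freed memory.
	 * Fixed order (the patch): stop the worker, then free. */
	worker_stop = 1;
	pthread_join(t, NULL);  /* wait until the worker is really done */

	free(table);            /* safe now: nothing references it anymore */
	printf("teardown finished without touching freed memory\n");
	return 0;
}

Build with "gcc -pthread race_sketch.c".  In the real code the stop is
astd_suspend() rather than joining the thread, but the ordering
requirement is the same.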