Robert Peterson wrote:
> Asbjørn Sannes wrote:
>> Asbjørn Sannes wrote:
>>
>>> I have been trying to use the STABLE branch of the cluster suite with
>>> vanilla 2.6.20 kernel, and everything seemed at first to work, my
>>> problem can be reproduced by this:
>>>
>>> mount a gfs filesystem anywhere..
>>> do a sync, this sync will now just hang there ..
>>>
>>> If I unmount the filesystem in another terminal, the sync command will
>>> end..
>>>
>>> .. dumping the kernel stack of sync shows that it is in
>>> __sync_inodes on
>>> __down_read, looking in the code it seems that is waiting for the
>>> s_umount semaphore (in the superblock)..
>>>
>>> Just tell me if you need any more information or if this is not the
>>> correct place for this..
>>>
>> Here is the trace for sync (while hanging) ..
>>
>> sync D ffffffff8062eb80 0 17843 15013 (NOTLB)
>> ffff810071689e98 0000000000000082 ffff810071689eb8 ffffffff8024d210
>> 0000000071689e18 0000000000000000 0000000100000000 ffff81007b670fe0
>> ffff81007b6711b8 00000000000004c8 ffff810037c84770 0000000000000001
>> Call Trace:
>> [<ffffffff8024d210>] wait_on_page_writeback_range+0xed/0x140
>> [<ffffffff8046046c>] __down_read+0x90/0xaa
>> [<ffffffff802407d6>] down_read+0x16/0x1a
>> [<ffffffff8028df35>] __sync_inodes+0x5f/0xbb
>> [<ffffffff8028dfa7>] sync_inodes+0x16/0x2f
>> [<ffffffff80290293>] do_sync+0x17/0x60
>> [<ffffffff802902ea>] sys_sync+0xe/0x12
>> [<ffffffff802098be>] system_call+0x7e/0x83
>>
>> Greetings,
>> Asbjørn Sannes
>>
> Hi Asbjørn,
>
> I'll look into this as soon as I can find the time...
>
Great!

I tried to figure out why the s_umount semaphore was not upped by
comparing to other filesystems, but the functions seem almost identical ..
so I cheated and looked at what had changed lately (from your patch):

diff -w -u -p -p -u -r1.1.2.1.4.1.2.1 diaper.c
--- gfs-kernel/src/gfs/diaper.c 26 Jun 2006 21:53:51 -0000 1.1.2.1.4.1.2.1
+++ gfs-kernel/src/gfs/diaper.c 2 Feb 2007 22:28:41 -0000
@@ -50,7 +50,7 @@ static int diaper_major = 0;
 static LIST_HEAD(diaper_list);
 static spinlock_t diaper_lock;
 static DEFINE_IDR(diaper_idr);
-kmem_cache_t *diaper_slab;
+struct kmem_cache *diaper_slab;

 /**
  * diaper_open -
@@ -232,9 +232,9 @@ get_dummy_sb(struct diaper_holder *dh)
 	struct inode *inode;
 	int error;

-	mutex_lock(&real->bd_mount_mutex);
+	down(&real->bd_mount_sem);
 	sb = sget(&gfs_fs_type, gfs_test_bdev_super, gfs_set_bdev_super, real);
-	mutex_unlock(&real->bd_mount_mutex);
+	up(&real->bd_mount_sem);
 	if (IS_ERR(sb))
 		return PTR_ERR(sb);

@@ -252,7 +252,6 @@ get_dummy_sb(struct diaper_holder *dh)
 	sb->s_op = &gfs_dummy_sops;
 	sb->s_fs_info = dh;

-	up_write(&sb->s_umount);
 	module_put(gfs_fs_type.owner);

 	dh->dh_dummy_sb = sb;
@@ -263,7 +262,6 @@ get_dummy_sb(struct diaper_holder *dh)
 	iput(inode);

  fail:
-	up_write(&sb->s_umount);
 	deactivate_super(sb);
 	return error;
 }

I undid the two up_write removals (added them back in), which helped.
I don't know if that is safe, though; maybe you could shed some light on
why they were removed? (I didn't find any changes that would do up_write
on s_umount later.)

> Regards,
>
> Bob Peterson
> Red Hat Cluster Suite
>
> --
> Linux-cluster mailing list
> Linux-cluster@xxxxxxxxxx
> https://www.redhat.com/mailman/listinfo/linux-cluster
>
Best regards,
Asbjørn Sannes