Asbjørn Sannes wrote:
> Robert Peterson wrote:
>> Asbjørn Sannes wrote:
>>> Asbjørn Sannes wrote:
>>>> I have been trying to use the STABLE branch of the cluster suite with
>>>> a vanilla 2.6.20 kernel, and everything seemed at first to work. My
>>>> problem can be reproduced like this:
>>>>
>>>> Mount a GFS filesystem anywhere, then do a sync; the sync will now
>>>> just hang there.
>>>>
>>>> If I unmount the filesystem in another terminal, the sync command
>>>> will end.
>>>>
>>>> Dumping the kernel stack of sync shows that it is in __sync_inodes
>>>> on __down_read; looking at the code, it seems to be waiting for the
>>>> s_umount semaphore (in the superblock).
>>>>
>>>> Just tell me if you need any more information, or if this is not the
>>>> correct place for this.
>>>
>>> Here is the trace for sync (while hanging):
>>>
>>> sync          D ffffffff8062eb80     0 17843  15013          (NOTLB)
>>>  ffff810071689e98 0000000000000082 ffff810071689eb8 ffffffff8024d210
>>>  0000000071689e18 0000000000000000 0000000100000000 ffff81007b670fe0
>>>  ffff81007b6711b8 00000000000004c8 ffff810037c84770 0000000000000001
>>> Call Trace:
>>>  [<ffffffff8024d210>] wait_on_page_writeback_range+0xed/0x140
>>>  [<ffffffff8046046c>] __down_read+0x90/0xaa
>>>  [<ffffffff802407d6>] down_read+0x16/0x1a
>>>  [<ffffffff8028df35>] __sync_inodes+0x5f/0xbb
>>>  [<ffffffff8028dfa7>] sync_inodes+0x16/0x2f
>>>  [<ffffffff80290293>] do_sync+0x17/0x60
>>>  [<ffffffff802902ea>] sys_sync+0xe/0x12
>>>  [<ffffffff802098be>] system_call+0x7e/0x83
>>>
>>> Greetings,
>>> Asbjørn Sannes
>>
>> Hi Asbjørn,
>>
>> I'll look into this as soon as I can find the time...
>
> Great! I tried to figure out why the s_umount semaphore was not upped by
> comparing with other filesystems, but the functions seem almost identical,
> so I cheated and looked at what had changed lately (from your patch):
>
> diff -w -u -p -p -u -r1.1.2.1.4.1.2.1 diaper.c
> --- gfs-kernel/src/gfs/diaper.c 26 Jun 2006 21:53:51 -0000 1.1.2.1.4.1.2.1
> +++ gfs-kernel/src/gfs/diaper.c 2 Feb 2007 22:28:41 -0000
> @@ -50,7 +50,7 @@ static int diaper_major = 0;
>  static LIST_HEAD(diaper_list);
>  static spinlock_t diaper_lock;
>  static DEFINE_IDR(diaper_idr);
> -kmem_cache_t *diaper_slab;
> +struct kmem_cache *diaper_slab;
>
>  /**
>   * diaper_open -
> @@ -232,9 +232,9 @@ get_dummy_sb(struct diaper_holder *dh)
>  	struct inode *inode;
>  	int error;
>
> -	mutex_lock(&real->bd_mount_mutex);
> +	down(&real->bd_mount_sem);
>  	sb = sget(&gfs_fs_type, gfs_test_bdev_super, gfs_set_bdev_super, real);
> -	mutex_unlock(&real->bd_mount_mutex);
> +	up(&real->bd_mount_sem);
>  	if (IS_ERR(sb))
>  		return PTR_ERR(sb);
>
> @@ -252,7 +252,6 @@ get_dummy_sb(struct diaper_holder *dh)
>  	sb->s_op = &gfs_dummy_sops;
>  	sb->s_fs_info = dh;
>
> -	up_write(&sb->s_umount);
>  	module_put(gfs_fs_type.owner);
>
>  	dh->dh_dummy_sb = sb;
> @@ -263,7 +262,6 @@ get_dummy_sb(struct diaper_holder *dh)
>  	iput(inode);
>
>  fail:
> -	up_write(&sb->s_umount);
>  	deactivate_super(sb);
>  	return error;
> }
>
> And I undid those up_write removals (added them back in), which helped.
> I don't know if it is safe, though, and maybe you could shed some light
> on why they were removed? (I didn't find any later changes that would do
> up_write on s_umount.)

Actually, it didn't enjoy unmount as much ..

Best regards,
Asbjørn Sannes

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster