Hi again,

I haven't received an answer yet, but here are more details about this issue.

In the end I unmounted the GFS filesystem on both nodes and ran gfs_fsck; it fixed several filesystem elements. After mounting it again, whenever we try to work in the previously damaged directories we get these messages:

GFS: fsid=hr-pm:gfs01.0: warning: assertion "(gh->gh_flags & LM_FLAG_ANY) || !(tmp_gh->gh_flags & LM_FLAG_ANY)" failed
GFS: fsid=hr-pm:gfs01.0: function = add_to_queue
GFS: fsid=hr-pm:gfs01.0: file = fs/gfs/glock.c, line = 1420
GFS: fsid=hr-pm:gfs01.0: time = 1237984594
BUG: warning at fs/gfs/util.c:287/gfs_assert_warn_i() (Tainted: P )
 [<f9ad7e91>] gfs_assert_warn_i+0x92/0xbd [gfs]
 [<f9aba680>] gfs_glock_nq+0x131/0x36f [gfs]
 [<f9aba8d1>] gfs_glock_nq_init+0x13/0x26 [gfs]
 [<f9acf378>] gfs_private_nopage+0x45/0x81 [gfs]
 [<c0460831>] __handle_mm_fault+0x23b/0xe08
 [<c04597a2>] __do_page_cache_readahead+0x1ab/0x1cc
 [<c06062fe>] do_page_fault+0x2a4/0x5ad
 [<c060605a>] do_page_fault+0x0/0x5ad
 [<c0607dfb>] error_code+0x4f/0x54
 [<c060007b>] __inet6_check_established+0x21f/0x394

Any ideas?

Thanks.

Frank

> Date: Fri, 20 Mar 2009 12:20:47 +0100
> From: Frank <frank@xxxxxxxxxxxxx>
> Subject: processes stalled reading gfs filesystem
> To: linux-cluster@xxxxxxxxxx
> Message-ID: <49C37C0F.5020308@xxxxxxxxxxxxx>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
> Hi,
> we have a couple of Dell servers with Red Hat 5.2 and OpenVZ, sharing a
> GFS filesystem.
>
> We have noticed that there is a directory where processes stall when
> they try to access it.
> For instance, look at these processes:
>
> [root@parmenides ~]# ps -fel | grep save
> 4 D root 8997 1 1 78 0 - 1780 339955 09:40 ?
> 00:02:31 /usr/sbin/save -s espai.upc.es -g Virtuals -LL -f - -m
> parmenides -t 1236294005 -l 4 -q -W 78 -N /mnt/gfs /mnt/gfs
> 0 S root 16736 21208 0 78 0 - 980 pipe_w 12:07 pts/1
> 00:00:00 grep save
> 4 D root 18796 1 1 78 0 - 1777 339955 08:46 ?
> 00:02:16 /usr/sbin/save -s espai.upc.es -g Virtuals -LL -f - -m
> parmenides -t 1236294005 -l 4 -q -W 78 -N /mnt/gfs /mnt/gfs
>
> Both processes are stalled reading a file:
>
> # lsof -p 8997 | grep gfs
> save 8997 root cwd DIR 253,7 2048 7022183
> /mnt/gfs/vz/private/109/usr/lib/openoffice/program
> save 8997 root 3r DIR 253,7 3864 26 /mnt/gfs
> save 8997 root 6r DIR 253,7 3864 232 /mnt/gfs/vz
> save 8997 root 7r DIR 253,7 3864 233
> /mnt/gfs/vz/private
> save 8997 root 8r DIR 253,7 3864 230761349
> /mnt/gfs/vz/private/109
> save 8997 root 9r DIR 253,7 3864 230773154
> /mnt/gfs/vz/private/109/usr
> save 8997 root 12r DIR 253,7 2048 7003944
> /mnt/gfs/vz/private/109/usr/lib
> save 8997 root 14r DIR 253,7 3864 7022175
> /mnt/gfs/vz/private/109/usr/lib/openoffice
>
> # lsof -p 18796 | grep gfs
> save 18796 root cwd DIR 253,7 2048 7022183
> /mnt/gfs/vz/private/109/usr/lib/openoffice/program
> save 18796 root 3r DIR 253,7 3864 26 /mnt/gfs
> save 18796 root 6r DIR 253,7 3864 232 /mnt/gfs/vz
> save 18796 root 7r DIR 253,7 3864 233
> /mnt/gfs/vz/private
> save 18796 root 8r DIR 253,7 3864 230761349
> /mnt/gfs/vz/private/109
> save 18796 root 9r DIR 253,7 3864 230773154
> /mnt/gfs/vz/private/109/usr
> save 18796 root 12r DIR 253,7 2048 7003944
> /mnt/gfs/vz/private/109/usr/lib
> save 18796 root 14r DIR 253,7 3864 7022175
> /mnt/gfs/vz/private/109/usr/lib/openoffice
>
> Also, there is a process with the glock_ flag accessing the same
> directory:
>
> 0 D root 8425 6783 0 78 0 - 669 glock_ 08:24 ?
> 00:00:00 /usr/lib/openoffice/program/pagein
> -L/usr/lib/openoffice/program @pagein-common
>
> What could the problem be? Corruption in the filesystem?
> Would a "gfs_fsck" fix the problem?
> Regards.
>
> Frank
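For anyone who wants to check their own nodes for the same symptom, here is a minimal sketch (not part of the original diagnosis) that lists processes in uninterruptible sleep (state D) together with the kernel wait channel they are blocked in, much like the `ps -fel | grep save` output quoted above. It assumes a procps-style `ps` that accepts `-eo pid,stat,wchan:32,args`; the function names `parse_blocked` and `list_blocked` are made up for illustration.

```python
import subprocess


def parse_blocked(ps_output):
    """Return (pid, wchan, command) for every process whose state starts with 'D'.

    Expects text with a header row followed by four columns:
    PID, STAT, WCHAN and the full command line.
    """
    blocked = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 3)  # command line may itself contain spaces
        if len(fields) < 4:
            continue
        pid, stat, wchan, args = fields
        if stat.startswith("D"):
            blocked.append((int(pid), wchan, args.rstrip()))
    return blocked


def list_blocked():
    """Run ps (assumed: procps syntax) and return the blocked processes."""
    out = subprocess.run(
        ["ps", "-eo", "pid,stat,wchan:32,args"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_blocked(out)
```

Calling `list_blocked()` on each node and looking for wait channels that start with `glock_` would probably show which processes are waiting on a GFS lock, but that reading of the wait channel is an assumption, not something confirmed in this thread.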
--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster