We have a couple of Dell servers running Red Hat 5.2 with OpenVZ, sharing a
GFS filesystem.
We have noticed that there is a directory where processes stall when
they try to access it.
For instance, look at these processes:
[root@parmenides ~]# ps -fel | grep save
4 D root 8997 1 1 78 0 - 1780 339955 09:40 ? 00:02:31 /usr/sbin/save -s espai.upc.es -g Virtuals -LL -f - -m parmenides -t 1236294005 -l 4 -q -W 78 -N /mnt/gfs /mnt/gfs
0 S root 16736 21208 0 78 0 - 980 pipe_w 12:07 pts/1 00:00:00 grep save
4 D root 18796 1 1 78 0 - 1777 339955 08:46 ? 00:02:16 /usr/sbin/save -s espai.upc.es -g Virtuals -LL -f - -m parmenides -t 1236294005 -l 4 -q -W 78 -N /mnt/gfs /mnt/gfs
Both processes are stalled reading a file:
# lsof -p 8997 | grep gfs
save 8997 root cwd DIR 253,7 2048 7022183
save 8997 root 3r DIR 253,7 3864 26 /mnt/gfs
save 8997 root 6r DIR 253,7 3864 232 /mnt/gfs/vz
save 8997 root 7r DIR 253,7 3864 233 /mnt/gfs/vz/private
save 8997 root 8r DIR 253,7 3864 230761349
save 8997 root 9r DIR 253,7 3864 230773154
save 8997 root 12r DIR 253,7 2048 7003944
save 8997 root 14r DIR 253,7 3864 7022175
# lsof -p 18796 | grep gfs
save 18796 root cwd DIR 253,7 2048 7022183
save 18796 root 3r DIR 253,7 3864 26 /mnt/gfs
save 18796 root 6r DIR 253,7 3864 232 /mnt/gfs/vz
save 18796 root 7r DIR 253,7 3864 233
save 18796 root 8r DIR 253,7 3864 230761349
save 18796 root 9r DIR 253,7 3864 230773154
save 18796 root 12r DIR 253,7 2048 7003944
save 18796 root 14r DIR 253,7 3864 7022175
There is also a process in state D, with glock_ in the WCHAN column, accessing the same directory:
0 D root 8425 6783 0 78 0 - 669 glock_ 08:24 ? 00:00:00 /usr/lib/openoffice/program/pagein -L/usr/lib/openoffice/program @pagein-common
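In case it helps anyone reproduce this, here is a minimal sketch for listing every process in uninterruptible sleep (state D) together with its wait channel. The function name filter_d_state is made up for illustration, and it assumes input lines of the form "pid stat wchan args" as printed by the standard procps option "-o pid,stat,wchan:32,args". I have not checked that every procps version honours the wider wchan column.

```shell
#!/bin/sh
# Hypothetical helper: keep only the lines whose second field (process
# state) starts with D, i.e. uninterruptible sleep. The ps header line is
# dropped because its second field is "STAT".
filter_d_state() {
    awk '$2 ~ /^D/ { print }'
}

# Intended use on a live node (assumes procps ps):
#   ps -eo pid,stat,wchan:32,args | filter_d_state
```

The wider wchan column is there because "ps -fel" truncates the wait channel, which is why only "glock_" is visible above.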
What could the problem be? Corruption in the filesystem?
Would running "gfs_fsck" fix the problem?