Hi there,

I would like to post an update to my ticket. It may help people who run into similar trouble.

I resolved my problem with the following steps:

- gfs_tool shrink /home (supposed to reclaim space, but it did not)
- gfs_tool reclaim /home (still not enough)
- unmount /home on all my nodes
- gfs_fsck -y /dev/etherd/e0.0
- remount /home

VICTORY = all the files with broken inodes are gone.
ADVICE = avoid folders holding 1 million files in the future.

Cheers,
Oly

On Wed, 2006-03-01 at 14:02 +1100, oly wrote:
> Hi there,
>
> I've got a 4-node RHEL4 cluster with GFS version 6.1.0 (built Jun 7
> 2005 12:46:04).
> The shared disk is a NAS detected by aoe as /dev/etherd/e0.0.
>
> I have a problem with a few files on the file system: if I try to
> modify the inodes of these files (delete the file, or unlink the
> inode), the cluster node where I launch the command loses the GFS, and
> the GFS modules stay busy and cannot be removed from the kernel. The
> node is then stuck, and the only solution is a hardware restart of
> that node.
> All the GFS journals seem to work fine... I can even stat the DEAD
> files.
>
> Does GFS have a problem manipulating files in a folder with more than
> 1 million files?
> Does anyone have a solution to remove these dead files, or to delete
> the folder that contains all of them?
> Can gfs_fsck resolve my problem?
> Is there a later version that fixes this problem?
>
> Thanks in advance.
> PS = see all the details below
>
> The error I get when I try to unlink the file inode:
> ===========ERROR============
> GFS: fsid=entcluster:sataide.2: fatal: filesystem consistency error
> GFS: fsid=entcluster:sataide.2: inode = 8516674/8516674
> GFS: fsid=entcluster:sataide.2: function = gfs_change_nlink
> GFS: fsid=entcluster:sataide.2: file = /usr/src/build/574067-i686/BUILD/smp/src/gfs/inode.c, line = 843
> GFS: fsid=entcluster:sataide.2: time = 1141080134
> GFS: fsid=entcluster:sataide.2: about to withdraw from the cluster
> GFS: fsid=entcluster:sataide.2: waiting for outstanding I/O
> GFS: fsid=entcluster:sataide.2: telling LM to withdraw
> lock_dlm: withdraw abandoned memory
> GFS: fsid=entcluster:sataide.2: withdrawn
> mh_magic = 0x01161970
> mh_type = 4
> mh_generation = 68
> mh_format = 400
> mh_incarn = 6
> no_formal_ino = 8516674
> no_addr = 8516674
> di_mode = 0664
> di_uid = 500
> di_gid = 500
> di_nlink = 0
> di_size = 0
> di_blocks = 1
> di_atime = 1141042636
> di_mtime = 1140001370
> di_ctime = 1140001370
> di_major = 0
> di_minor = 0
> di_rgrp = 8513987
> di_goal_rgrp = 8513987
> di_goal_dblk = 2682
> di_goal_mblk = 2682
> di_flags = 0x00000004
> di_payload_format = 0
> di_type = 1
> di_height = 0
> di_incarn = 0
> di_pad = 0
> di_depth = 0
> di_entries = 0
> no_formal_ino = 0
> no_addr = 0
> di_eattr = 0
> di_reserved =
> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 00 00 00 00 00 00 00 00
> ========END OF ERROR==========
>
> My cman status:
> ==========STATUS============
> Protocol version: 5.0.1
> Config version: 4
> Cluster name: entcluster
> Cluster ID: 42548
> Cluster Member: Yes
> Membership state: Cluster-Member
> Nodes: 4
> Expected_votes: 1
> Total_votes: 4
> Quorum: 3
> Active subsystems: 5
> Node name: XXX.domainX.tld
> Node addresses: x.x.x.x
> ========END CMAN=========
>
> My gfs_tool df:
> ============DF=========
> /home:
> SB lock proto = "lock_dlm"
> SB lock table = "entcluster:sataide"
> SB ondisk format = 1309
> SB multihost format = 1401
> Block size = 4096
> Journals = 4
> Resource Groups = 274
> Mounted lock proto = "lock_dlm"
> Mounted lock table = "entcluster:sataide"
> Mounted host data = ""
> Journal number = 0
> Lock module flags =
> Local flocks = FALSE
> Local caching = FALSE
> Oopses OK = FALSE
>
> Type       Total      Used       Free       use%
> ------------------------------------------------------------------------
> inodes     100642     100642     0          100%
> metadata   3842538    8527       3834011    0%
> data       13999476   2760327    11239149   20%
> =============END DF =========
>
> Version of my modules:
> ========modules========
> CMAN 2.6.9-36.0 (built May 31 2005 12:15:02) installed
> DLM 2.6.9-34.0 (built Jun 2 2005 15:17:56) installed
> Lock_Harness 2.6.9-35.5 (built Jun 7 2005 12:42:30) installed
> GFS 2.6.9-35.5 (built Jun 7 2005 12:42:49) installed
> aoe: aoe_init: AoE v2.6-11 initialised.
> Lock_DLM (built Jun 7 2005 12:42:32) installed
> ========end modules========
>
> --
> Aurelien Lemaire (oly)
> http://www.squiz.net
> Sydney | Canberra | London
> 92 Jarrett St Leichhardt, Sydney, NSW 2040
> T:+61 2 9568 6866
> F:+61 2 9568 6733
>
> --
> Linux-cluster@xxxxxxxxxx
> https://www.redhat.com/mailman/listinfo/linux-cluster