Re: GFS = filesystem consistency error

oly <cluster@xxxxxxxxx> · Fri, 03 Mar 2006 09:13:41 +1100



Hi there,
I would like to give an update on my ticket. It may help people
who have similar trouble.
I resolved my problem by doing:
- gfs_tool shrink /home (supposed to reclaim, but did not)
- gfs_tool reclaim /home (still not enough)
- unmount /home on all my nodes
- gfs_fsck -y /dev/etherd/e0.0
- remount my /home
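
For anyone who wants to script it, the steps above can be sketched roughly as
follows. This is only a sketch under assumptions: the run wrapper, the variable
names and the "mount -t gfs" invocation are mine (the steps above only say to
remount). It prints the commands unless DRY_RUN=0 is set, and it only unmounts
the local node, so the other nodes still have to be unmounted by hand before
the fsck.

```shell
#!/bin/sh
# Sketch of the recovery sequence: shrink, reclaim, unmount, fsck, remount.
# Dry-run by default; set DRY_RUN=0 to really execute the commands.
MOUNTPOINT=/home             # mount point from the steps above
DEVICE=/dev/etherd/e0.0      # AoE device from the steps above
DRY_RUN=${DRY_RUN:-1}

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

recover_gfs() {
    run gfs_tool shrink "$MOUNTPOINT"
    run gfs_tool reclaim "$MOUNTPOINT"
    echo "NOTE: unmount $MOUNTPOINT on every other node before continuing"
    run umount "$MOUNTPOINT"
    run gfs_fsck -y "$DEVICE"
    # Assumption: a plain GFS mount; adjust to your own fstab entry.
    run mount -t gfs "$DEVICE" "$MOUNTPOINT"
}

recover_gfs
```

Running it as-is just lists the commands in order, which is a cheap way to
double-check the device and mount point before doing anything for real.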
VICTORY = I lost all the files with broken inodes
ADVICE = avoid folders with 1 million files in the future
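
One common way to follow that advice is to spread files over subdirectories
keyed on a checksum of the file name. The sketch below is not what I ran on
the cluster; shard_dir, shard_path and the choice of 256 buckets are
assumptions made purely for illustration.

```shell
#!/bin/sh
# Map a file name to one of 256 subdirectories (00..ff) so that no single
# directory has to hold millions of entries.

# Print a two-digit hex bucket derived from the CRC of the file name.
shard_dir() {
    # cksum on stdin prints "<crc> <byte count>"; keep only the CRC.
    crc=$(printf '%s' "$1" | cksum | cut -d ' ' -f 1)
    printf '%02x\n' $((crc % 256))
}

# Print the sharded path for a file: <base>/<bucket>/<name>.
shard_path() {
    printf '%s/%s/%s\n' "$1" "$(shard_dir "$2")" "$2"
}

shard_path /home/data report-000123.txt
```

The same name always lands in the same bucket, so a lookup only needs the
file name and never has to scan the whole tree.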

Cheers, Oly


On Wed, 2006-03-01 at 14:02 +1100, oly wrote:
> Hi there
>         I've got a 4-node RHEL4 cluster with GFS version 6.1.0 (built
>         Jun  7 2005 12:46:04).
>         The shared disk is a NAS detected by aoe as /dev/etherd/e0.0.
>         I have a problem with a few files on the file system: if I try
>         to modify the inodes of these files (delete the file, or unlink
>         the inode), the cluster node where I launch the command loses
>         the GFS, and the GFS modules stay busy and cannot be removed
>         from the kernel. The node is then stuck, and the only solution
>         is a hardware restart of the node.
>         All the GFS journals seem to work fine ... I can even stat
>         the DEAD file.
>         Does GFS have a problem manipulating files in a folder with
>         more than 1 million files?
>         Does anyone have a solution to remove these dead files, or to
>         delete the folder that contains all these dead files?
>         Can gfs_fsck resolve my problem?
>         Is there any later version that fixes this problem?
>         
>         Thanks in advance.
>         PS = see below for all the details
>          
>         The error I get when I try to unlink the file inode:
>         ===========ERROR============
>         GFS: fsid=entcluster:sataide.2: fatal: filesystem consistency error
>         GFS: fsid=entcluster:sataide.2:   inode = 8516674/8516674
>         GFS: fsid=entcluster:sataide.2:   function = gfs_change_nlink
>         GFS: fsid=entcluster:sataide.2:   file = /usr/src/build/574067-i686/BUILD/smp/src/gfs/inode.c, line = 843
>         GFS: fsid=entcluster:sataide.2:   time = 1141080134
>         GFS: fsid=entcluster:sataide.2: about to withdraw from the cluster
>         GFS: fsid=entcluster:sataide.2: waiting for outstanding I/O
>         GFS: fsid=entcluster:sataide.2: telling LM to withdraw
>         lock_dlm: withdraw abandoned memory
>         GFS: fsid=entcluster:sataide.2: withdrawn
>           mh_magic = 0x01161970
>           mh_type = 4
>           mh_generation = 68
>           mh_format = 400
>           mh_incarn = 6
>           no_formal_ino = 8516674
>           no_addr = 8516674
>           di_mode = 0664
>           di_uid = 500
>           di_gid = 500
>           di_nlink = 0
>           di_size = 0
>           di_blocks = 1
>           di_atime = 1141042636
>           di_mtime = 1140001370
>           di_ctime = 1140001370
>           di_major = 0
>           di_minor = 0
>           di_rgrp = 8513987
>           di_goal_rgrp = 8513987
>           di_goal_dblk = 2682
>           di_goal_mblk = 2682
>           di_flags = 0x00000004
>           di_payload_format = 0
>           di_type = 1
>           di_height = 0
>           di_incarn = 0
>           di_pad = 0
>           di_depth = 0
>           di_entries = 0
>           no_formal_ino = 0
>           no_addr = 0
>           di_eattr = 0
>           di_reserved =
>         00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>         00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>         00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>         00 00 00 00 00 00 00 00
>         ========END OF ERROR==========
>         
>         My cman status:
>         ==========STATUS============
>         Protocol version: 5.0.1
>         Config version: 4
>         Cluster name: entcluster
>         Cluster ID: 42548
>         Cluster Member: Yes
>         Membership state: Cluster-Member
>         Nodes: 4
>         Expected_votes: 1
>         Total_votes: 4
>         Quorum: 3
>         Active subsystems: 5
>         Node name: XXX.domainX.tld
>         Node addresses: x.x.x.x
>         ========END CMAN=========
>         
>         My gfs_tool df :
>         ============DF=========
>         /home:
>           SB lock proto = "lock_dlm"
>           SB lock table = "entcluster:sataide"
>           SB ondisk format = 1309
>           SB multihost format = 1401
>           Block size = 4096
>           Journals = 4
>           Resource Groups = 274
>           Mounted lock proto = "lock_dlm"
>           Mounted lock table = "entcluster:sataide"
>           Mounted host data = ""
>           Journal number = 0
>           Lock module flags =
>           Local flocks = FALSE
>           Local caching = FALSE
>           Oopses OK = FALSE
>         
>           Type           Total          Used           Free           use%
>           ------------------------------------------------------------------------
>           inodes         100642         100642         0              100%
>           metadata       3842538        8527           3834011        0%
>           data           13999476       2760327        11239149       20%
>         =============END DF =========
>         Version of my modules :
>         ========modules========
>         CMAN 2.6.9-36.0 (built May 31 2005 12:15:02) installed
>         DLM 2.6.9-34.0 (built Jun  2 2005 15:17:56) installed
>         Lock_Harness 2.6.9-35.5 (built Jun  7 2005 12:42:30) installed
>         GFS 2.6.9-35.5 (built Jun  7 2005 12:42:49) installed
>         aoe: aoe_init: AoE v2.6-11 initialised.
>         Lock_DLM (built Jun  7 2005 12:42:32) installed
>         ========end modules========
>         
>         
>         
>         -- 
>         Aurelien Lemaire (oly)
>         http://www.squiz.net
>         Sydney | Canberra | London
>         92 Jarrett St Leichhardt, Sydney, NSW 2040
>         T:+61 2 9568 6866 
>         F:+61 2 9568 6733    
> 
> --
> 
> Linux-cluster@xxxxxxxxxx
> https://www.redhat.com/mailman/listinfo/linux-cluster
