GFS2: fsid=MyCluster:gfs.1: fatal: invalid metadata block

Hi guys,
I have a two-node GFS2 cluster built on a logical volume created on the DRBD block device /dev/drbd0. The GFS2 mount points on both nodes are exported as Samba shares. Two clients each mount one of the shares and copy data into it. Hours later, one client (call it clientA) had finished all its tasks, while the other client (call it clientB) was still copying at a very slow write speed (2-3 MB/s, versus 40-100 MB/s normally).
I then suspected that something was wrong with the GFS2 filesystem on the server node that clientB mounts from, so I tried writing some data into it by executing the following command:
[root@dcs-229 ~]# dd if=/dev/zero of=./data2 bs=128k count=1000
1000+0 records in
1000+0 records out
131072000 bytes (131 MB) copied, 183.152 s, 716 kB/s
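As a quick sanity check on the figure dd prints, the rate can be recomputed from the byte count and elapsed time above. This is only a minimal sketch: the variable and function names are made up for illustration, and it assumes dd reports decimal units (1 kB = 1000 bytes).

```python
# Figures copied from the dd output above.
BLOCK_SIZE = 128 * 1024      # bs=128k
BLOCK_COUNT = 1000           # count=1000
bytes_copied = 131072000     # as reported by dd
elapsed_s = 183.152          # as reported by dd


def throughput_bytes_per_s(nbytes, seconds):
    """Average write rate in bytes per second."""
    return nbytes / seconds


# bs * count should match the byte total dd printed.
assert BLOCK_SIZE * BLOCK_COUNT == bytes_copied

rate = throughput_bytes_per_s(bytes_copied, elapsed_s)
print(f"{rate / 1000:.0f} kB/s")   # decimal kB, assumed to match dd's units
print(f"{rate / 1e6:.2f} MB/s")    # for comparison with the usual 40-100 MB/s
```

The result is well under 1 MB/s, so this local write is even slower than the 2-3 MB/s seen by clientB over Samba.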
The write speed reported by dd is far too slow; the command almost hangs. I ran it again and it hung completely. I then terminated it with Ctrl+C, and the kernel reported the following error messages:
Nov 12 11:50:11 dcs-229 kernel: GFS2: fsid=MyCluster:gfs.1: fatal: invalid metadata block
Nov 12 11:50:11 dcs-229 kernel: GFS2: fsid=MyCluster:gfs.1:   bh = 25 (magic number)
Nov 12 11:50:11 dcs-229 kernel: GFS2: fsid=MyCluster:gfs.1:   function = gfs2_meta_indirect_buffer, file = fs/gfs2/meta_io.c, line = 393
Nov 12 11:50:11 dcs-229 kernel: GFS2: fsid=MyCluster:gfs.1: jid=0: Trying to acquire journal lock...
Nov 12 11:50:11 dcs-229 kernel: Pid: 12044, comm: glock_workqueue Not tainted 2.6.32-358.el6.x86_64 #1
Nov 12 11:50:11 dcs-229 kernel: Call Trace:
Nov 12 11:50:11 dcs-229 kernel: [<ffffffffa044be22>] ? gfs2_lm_withdraw+0x102/0x130 [gfs2]
Nov 12 11:50:11 dcs-229 kernel: [<ffffffff81096cc0>] ? wake_bit_function+0x0/0x50
Nov 12 11:50:11 dcs-229 kernel: [<ffffffffa044bf75>] ? gfs2_meta_check_ii+0x45/0x50 [gfs2]
Nov 12 11:50:11 dcs-229 kernel: [<ffffffffa04367d9>] ? gfs2_meta_indirect_buffer+0xf9/0x100 [gfs2]
Nov 12 11:50:11 dcs-229 kernel: [<ffffffff8105e203>] ? perf_event_task_sched_out+0x33/0x80
Nov 12 11:50:11 dcs-229 kernel: [<ffffffffa0431505>] ? gfs2_inode_refresh+0x25/0x2c0 [gfs2]
Nov 12 11:50:11 dcs-229 kernel: [<ffffffffa0430b48>] ? inode_go_lock+0x88/0xf0 [gfs2]
Nov 12 11:50:11 dcs-229 kernel: [<ffffffffa042f25b>] ? do_promote+0x1bb/0x330 [gfs2]
Nov 12 11:50:11 dcs-229 kernel: [<ffffffffa042f548>] ? finish_xmote+0x178/0x410 [gfs2]
Nov 12 11:50:11 dcs-229 kernel: [<ffffffffa04303e3>] ? glock_work_func+0x133/0x1d0 [gfs2]
Nov 12 11:50:11 dcs-229 kernel: [<ffffffffa04302b0>] ? glock_work_func+0x0/0x1d0 [gfs2]
Nov 12 11:50:11 dcs-229 kernel: [<ffffffff81090ac0>] ? worker_thread+0x170/0x2a0
Nov 12 11:50:11 dcs-229 kernel: [<ffffffff81096c80>] ? autoremove_wake_function+0x0/0x40
Nov 12 11:50:11 dcs-229 kernel: [<ffffffff81090950>] ? worker_thread+0x0/0x2a0
Nov 12 11:50:11 dcs-229 kernel: [<ffffffff81096916>] ? kthread+0x96/0xa0
Nov 12 11:50:11 dcs-229 kernel: [<ffffffff8100c0ca>] ? child_rip+0xa/0x20
Nov 12 11:50:11 dcs-229 kernel: [<ffffffff81096880>] ? kthread+0x0/0xa0
Nov 12 11:50:11 dcs-229 kernel: [<ffffffff8100c0c0>] ? child_rip+0x0/0x20
Nov 12 11:50:11 dcs-229 kernel: GFS2: fsid=MyCluster:gfs.1: jid=0: Failed
And the other node also reports error messages:
Nov 12 11:48:50 dcs-226 kernel: Pid: 13784, comm: glock_workqueue Not tainted 2.6.32-358.el6.x86_64 #1
Nov 12 11:48:50 dcs-226 kernel: Call Trace:
Nov 12 11:48:50 dcs-226 kernel: [<ffffffffa0478e22>] ? gfs2_lm_withdraw+0x102/0x130 [gfs2]
Nov 12 11:48:50 dcs-226 kernel: [<ffffffff81096cc0>] ? wake_bit_function+0x0/0x50
Nov 12 11:48:50 dcs-226 kernel: [<ffffffffa0478f75>] ? gfs2_meta_check_ii+0x45/0x50 [gfs2]
Nov 12 11:48:50 dcs-226 kernel: [<ffffffffa04637d9>] ? gfs2_meta_indirect_buffer+0xf9/0x100 [gfs2]
Nov 12 11:48:50 dcs-226 kernel: [<ffffffff8105e203>] ? perf_event_task_sched_out+0x33/0x80
Nov 12 11:48:50 dcs-226 kernel: [<ffffffffa045e505>] ? gfs2_inode_refresh+0x25/0x2c0 [gfs2]
Nov 12 11:48:50 dcs-226 kernel: [<ffffffffa045db48>] ? inode_go_lock+0x88/0xf0 [gfs2]
Nov 12 11:48:50 dcs-226 kernel: GFS2: fsid=MyCluster:gfs.0: fatal: invalid metadata block
Nov 12 11:48:51 dcs-226 kernel: GFS2: fsid=MyCluster:gfs.0:   bh = 66213 (magic number)
Nov 12 11:48:51 dcs-226 kernel: GFS2: fsid=MyCluster:gfs.0:   function = gfs2_meta_indirect_buffer, file = fs/gfs2/meta_io.c, line = 393
Nov 12 11:48:51 dcs-226 kernel: GFS2: fsid=MyCluster:gfs.0: about to withdraw this file system
Nov 12 11:48:51 dcs-226 kernel: GFS2: fsid=MyCluster:gfs.0: telling LM to unmount
Nov 12 11:48:51 dcs-226 kernel: [<ffffffffa045c25b>] ? do_promote+0x1bb/0x330 [gfs2]
Nov 12 11:48:51 dcs-226 kernel: [<ffffffffa045c548>] ? finish_xmote+0x178/0x410 [gfs2]
Nov 12 11:48:51 dcs-226 kernel: [<ffffffffa045d3e3>] ? glock_work_func+0x133/0x1d0 [gfs2]
Nov 12 11:48:51 dcs-226 kernel: [<ffffffffa045d2b0>] ? glock_work_func+0x0/0x1d0 [gfs2]
Nov 12 11:48:51 dcs-226 kernel: [<ffffffff81090ac0>] ? worker_thread+0x170/0x2a0
Nov 12 11:48:51 dcs-226 kernel: [<ffffffff81096c80>] ? autoremove_wake_function+0x0/0x40
Nov 12 11:48:51 dcs-226 kernel: [<ffffffff81090950>] ? worker_thread+0x0/0x2a0
Nov 12 11:48:51 dcs-226 kernel: [<ffffffff81096916>] ? kthread+0x96/0xa0
Nov 12 11:48:51 dcs-226 kernel: [<ffffffff8100c0ca>] ? child_rip+0xa/0x20
Nov 12 11:48:51 dcs-226 kernel: [<ffffffff81096880>] ? kthread+0x0/0xa0
Nov 12 11:48:51 dcs-226 kernel: [<ffffffff8100c0c0>] ? child_rip+0x0/0x20
After this, the mount points crashed. What should I do? Can anyone help me?
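To make the two logs easier to compare, here is a rough Python sketch that pulls out, per fsid, the number printed after "bh =" (presumably the block that failed the metadata check). The regular expressions and the function name bad_blocks are my own assumptions, based only on the lines quoted in this post; they are not a general parser for GFS2 kernel messages.

```python
import re
from collections import defaultdict

# Assumed message shape: "... kernel: GFS2: fsid=<cluster>:<fs>.<n>: <message>"
GFS2_RE = re.compile(r"kernel: GFS2: fsid=(?P<fsid>\S+?): +(?P<msg>.*)$")
# Assumed detail shape: "bh = <number> (<what failed>)"
BH_RE = re.compile(r"bh = (?P<block>\d+) \((?P<kind>[^)]*)\)")


def bad_blocks(log_lines):
    """Map each fsid to the list of 'bh' numbers reported for it."""
    blocks = defaultdict(list)
    for line in log_lines:
        m = GFS2_RE.search(line)
        if not m:
            continue  # call-trace lines and non-GFS2 messages are skipped
        bh = BH_RE.search(m.group("msg"))
        if bh:
            blocks[m.group("fsid")].append(int(bh.group("block")))
    return dict(blocks)


# Sample lines copied from the logs of both nodes above.
sample = [
    "Nov 12 11:50:11 dcs-229 kernel: GFS2: fsid=MyCluster:gfs.1: fatal: invalid metadata block",
    "Nov 12 11:50:11 dcs-229 kernel: GFS2: fsid=MyCluster:gfs.1:   bh = 25 (magic number)",
    "Nov 12 11:50:11 dcs-229 kernel: Call Trace:",
    "Nov 12 11:48:50 dcs-226 kernel: GFS2: fsid=MyCluster:gfs.0: fatal: invalid metadata block",
    "Nov 12 11:48:51 dcs-226 kernel: GFS2: fsid=MyCluster:gfs.0:   bh = 66213 (magic number)",
]

result = bad_blocks(sample)
print(result)
```

On the lines above this gives block 25 for MyCluster:gfs.1 (dcs-229) and block 66213 for MyCluster:gfs.0 (dcs-226), so the two nodes complained about different blocks.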


-- 
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
