>>> Nick Couchman 03/23/07 11:57 AM >>>
I'm currently evaluating GFS2 for some clustering that I want to do, and I've run into a little problem. I'm using kernel 2.6.20.3 (with GFS2 included) and the GFS2 userspace stuff from the RH Cluster page (sourceware.org/cluster). I've tried both the "official" 2.0.0 release and the latest CVS version, and both exhibit the same behavior: when I try to make a GFS2 filesystem, mkfs.gfs2 just hangs. I'm doing this on a 1 GB iSCSI volume, and the host has already transferred 6.9 GB of data. What in the world is it doing?! If I enable debug output, I get the following:
Command Line Arguments:
  qcsize = 1
  jsize = 32
  journals = 2
  override = 0
  proto = lock_dlm
  quiet = 0
  rgsize = optimize for best performance
  table = fstest:testfs
  utsize = 1
  device = /dev/sdb1
This will destroy any data on /dev/sdb1.
  It appears to contain a ext3 filesystem.
Are you sure you want to proceed? [y/n] y
Partition size = 1955808
Device Geometry: (in basic blocks) SubDevice #0: start = 0, length = 1955808, rgf_flags = 0x00000000
Device Geometry: (in FS blocks) SubDevice #0: start = 0, length = 244476, rgf_flags = 0x00000000
Device Size: 244476
Data Subdevice 0 rg sz = 256 nrgrp = 4
subdevice 0: rg_o = 17, rg_l = 61117
subdevice 0: rg_o = 61134, rg_l = 61114
subdevice 0: rg_o = 122248, rg_l = 61114
subdevice 0: rg_o = 183362, rg_l = 61114
ri_addr:: 17 ri_length:: 4 ri_data0:: 21 ri_data:: 61112 ri_bitbytes:: 15278
ri_addr:: 61134 ri_length:: 4 ri_data0:: 61138 ri_data:: 61108 ri_bitbytes:: 15277
ri_addr:: 122248 ri_length:: 4 ri_data0:: 122252 ri_data:: 61108 ri_bitbytes:: 15277
ri_addr:: 183362 ri_length:: 4 ri_data0:: 183366 ri_data:: 61108 ri_bitbytes:: 15277
Root directory:
  mh_magic:: 0x01161970 mh_type:: 4 mh_format:: 400 no_formal_ino:: 1 no_addr:: 21
  di_mode:: 040755 di_uid:: 0 di_gid:: 0 di_nlink:: 2 di_size:: 3864 di_blocks:: 1
  di_atime:: 1174670528 di_mtime:: 1174670528 di_ctime:: 1174670528 di_major:: 0 di_minor:: 0
  di_goal_meta:: 21 di_goal_data:: 21 di_flags:: 0x00000001 di_payload_format:: 1200
  di_height:: 0 di_depth:: 0 di_entries:: 2 di_eattr:: 0
Master dir:
  mh_magic:: 0x01161970 mh_type:: 4 mh_format:: 400 no_formal_ino:: 2 no_addr:: 22
  di_mode:: 040755 di_uid:: 0 di_gid:: 0 di_nlink:: 2 di_size:: 3864 di_blocks:: 1
  di_atime:: 1174670528 di_mtime:: 1174670528 di_ctime:: 1174670528 di_major:: 0 di_minor:: 0
  di_goal_meta:: 22 di_goal_data:: 22 di_flags:: 0x00000201 di_payload_format:: 1200
  di_height:: 0 di_depth:: 0 di_entries:: 2 di_eattr:: 0
Super Block:
  mh_magic:: 0x01161970 mh_type:: 1 mh_format:: 100
  sb_fs_format:: 1801 sb_multihost_format:: 1900 sb_bsize:: 4096 sb_bsize_shift:: 12
  no_formal_ino:: 2 no_addr:: 22
  no_formal_ino:: 1 no_addr:: 21
  sb_lockproto:: lock_dlm sb_locktable:: fstest:testfs
Journal 0:
  mh_magic:: 0x01161970 mh_type:: 4 mh_format:: 400 no_formal_ino:: 4 no_addr:: 24
  di_mode:: 0100600 di_uid:: 0 di_gid:: 0 di_nlink:: 1 di_size:: 33554432 di_blocks:: 8210
  di_atime:: 1174670528 di_mtime:: 1174670528 di_ctime:: 1174670528 di_major:: 0 di_minor:: 0
  di_goal_meta:: 41 di_goal_data:: 8233 di_flags:: 0x00000200 di_payload_format:: 0
  di_height:: 2 di_depth:: 0 di_entries:: 0 di_eattr:: 0
Journal 1:
  mh_magic:: 0x01161970 mh_type:: 4 mh_format:: 400 no_formal_ino:: 5 no_addr:: 8234
  di_mode:: 0100600 di_uid:: 0 di_gid:: 0 di_nlink:: 1 di_size:: 33554432 di_blocks:: 8210
  di_atime:: 1174670528 di_mtime:: 1174670528 di_ctime:: 1174670528 di_major:: 0 di_minor:: 0
  di_goal_meta:: 8251 di_goal_data:: 16443 di_flags:: 0x00000200 di_payload_format:: 0
  di_height:: 2 di_depth:: 0 di_entries:: 0 di_eattr:: 0
Jindex:
  mh_magic:: 0x01161970 mh_type:: 4 mh_format:: 400 no_formal_ino:: 3 no_addr:: 23
  di_mode:: 040700 di_uid:: 0 di_gid:: 0 di_nlink:: 2 di_size:: 3864 di_blocks:: 1
  di_atime:: 1174670528 di_mtime:: 1174670528 di_ctime:: 1174670528 di_major:: 0 di_minor:: 0
  di_goal_meta:: 23 di_goal_data:: 23 di_flags:: 0x00000201 di_payload_format:: 1200
  di_height:: 0 di_depth:: 0 di_entries:: 4 di_eattr:: 0
Inum Range 0:
  mh_magic:: 0x01161970 mh_type:: 4 mh_format:: 400 no_formal_ino:: 7 no_addr:: 16445
  di_mode:: 0100600 di_uid:: 0 di_gid:: 0 di_nlink:: 1 di_size:: 16 di_blocks:: 1
  di_atime:: 1174670528 di_mtime:: 1174670528 di_ctime:: 1174670528 di_major:: 0 di_minor:: 0
  di_goal_meta:: 16445 di_goal_data:: 16445 di_flags:: 0x00000201 di_payload_format:: 0
  di_height:: 0 di_depth:: 0 di_entries:: 0 di_eattr:: 0
StatFS Change 0:
  mh_magic:: 0x01161970 mh_type:: 4 mh_format:: 400 no_formal_ino:: 8 no_addr:: 16446
  di_mode:: 0100600 di_uid:: 0 di_gid:: 0 di_nlink:: 1 di_size:: 24 di_blocks:: 1
  di_atime:: 1174670528 di_mtime:: 1174670528 di_ctime:: 1174670528 di_major:: 0 di_minor:: 0
  di_goal_meta:: 16446 di_goal_data:: 16446 di_flags:: 0x00000201 di_payload_format:: 0
  di_height:: 0 di_depth:: 0 di_entries:: 0 di_eattr:: 0
Quota Change 0:
  mh_magic:: 0x01161970 mh_type:: 4 mh_format:: 400 no_formal_ino:: 9 no_addr:: 16447
  di_mode:: 0100600 di_uid:: 0 di_gid:: 0 di_nlink:: 1 di_size:: 1048576 di_blocks:: 257
  di_atime:: 1174670528 di_mtime:: 1174670528 di_ctime:: 1174670528 di_major:: 0 di_minor:: 0
  di_goal_meta:: 16447 di_goal_data:: 16703 di_flags:: 0x00000200 di_payload_format:: 0
  di_height:: 1 di_depth:: 0 di_entries:: 0 di_eattr:: 0
Inum Range 1:
  mh_magic:: 0x01161970 mh_type:: 4 mh_format:: 400 no_formal_ino:: 10 no_addr:: 16704
  di_mode:: 0100600 di_uid:: 0 di_gid:: 0 di_nlink:: 1 di_size:: 16 di_blocks:: 1
  di_atime:: 1174670528 di_mtime:: 1174670528 di_ctime:: 1174670528 di_major:: 0 di_minor:: 0
  di_goal_meta:: 16704 di_goal_data:: 16704 di_flags:: 0x00000201 di_payload_format:: 0
  di_height:: 0 di_depth:: 0 di_entries:: 0 di_eattr:: 0
StatFS Change 1:
  mh_magic:: 0x01161970 mh_type:: 4 mh_format:: 400 no_formal_ino:: 11 no_addr:: 16705
  di_mode:: 0100600 di_uid:: 0 di_gid:: 0 di_nlink:: 1 di_size:: 24 di_blocks:: 1
  di_atime:: 1174670528 di_mtime:: 1174670528 di_ctime:: 1174670528 di_major:: 0 di_minor:: 0
  di_goal_meta:: 16705 di_goal_data:: 16705 di_flags:: 0x00000201 di_payload_format:: 0
  di_height:: 0 di_depth:: 0 di_entries:: 0 di_eattr:: 0
Quota Change 1:
  mh_magic:: 0x01161970 mh_type:: 4 mh_format:: 400 no_formal_ino:: 12 no_addr:: 16706
  di_mode:: 0100600 di_uid:: 0 di_gid:: 0 di_nlink:: 1 di_size:: 1048576 di_blocks:: 257
  di_atime:: 1174670528 di_mtime:: 1174670528 di_ctime:: 1174670528 di_major:: 0 di_minor:: 0
  di_goal_meta:: 16706 di_goal_data:: 16962 di_flags:: 0x00000200 di_payload_format:: 0
  di_height:: 1 di_depth:: 0 di_entries:: 0 di_eattr:: 0
per_node:
  mh_magic:: 0x01161970 mh_type:: 4 mh_format:: 400 no_formal_ino:: 6 no_addr:: 16444
  di_mode:: 040700 di_uid:: 0 di_gid:: 0 di_nlink:: 2 di_size:: 3864 di_blocks:: 1
  di_atime:: 1174670528 di_mtime:: 1174670528 di_ctime:: 1174670528 di_major:: 0 di_minor:: 0
  di_goal_meta:: 16444 di_goal_data:: 16444 di_flags:: 0x00000201 di_payload_format:: 1200
  di_height:: 0 di_depth:: 0 di_entries:: 8 di_eattr:: 0
Inum Inode:
  mh_magic:: 0x01161970 mh_type:: 4 mh_format:: 400 no_formal_ino:: 13 no_addr:: 16963
  di_mode:: 0100600 di_uid:: 0 di_gid:: 0 di_nlink:: 1 di_size:: 0 di_blocks:: 1
  di_atime:: 1174670528 di_mtime:: 1174670528 di_ctime:: 1174670528 di_major:: 0 di_minor:: 0
  di_goal_meta:: 16963 di_goal_data:: 16963 di_flags:: 0x00000201 di_payload_format:: 0
  di_height:: 0 di_depth:: 0 di_entries:: 0 di_eattr:: 0
StatFS Inode:
  mh_magic:: 0x01161970 mh_type:: 4 mh_format:: 400 no_formal_ino:: 14 no_addr:: 16964
  di_mode:: 0100600 di_uid:: 0 di_gid:: 0 di_nlink:: 1 di_size:: 0 di_blocks:: 1
  di_atime:: 1174670528 di_mtime:: 1174670528 di_ctime:: 1174670528 di_major:: 0 di_minor:: 0
  di_goal_meta:: 16964 di_goal_data:: 16964 di_flags:: 0x00000201 di_payload_format:: 0
  di_height:: 0 di_depth:: 0 di_entries:: 0 di_eattr:: 0
Resource Index:
  mh_magic:: 0x01161970 mh_type:: 4 mh_format:: 400 no_formal_ino:: 15 no_addr:: 16965
  di_mode:: 0100600 di_uid:: 0 di_gid:: 0 di_nlink:: 1 di_size:: 384 di_blocks:: 1
  di_atime:: 1174670528 di_mtime:: 1174670528 di_ctime:: 1174670528 di_major:: 0 di_minor:: 0
  di_goal_meta:: 16965 di_goal_data:: 16965 di_flags:: 0x00000201 di_payload_format:: 1100
  di_height:: 0 di_depth:: 0 di_entries:: 0 di_eattr:: 0
Root quota:
  qu_limit:: 0 qu_warn:: 0 qu_value:: 1
Next Inum: 17
Statfs:
...and that's where it stops. To set things up, I compiled the sources, then did the following (as per the usage instructions):
1) Wrote configuration file - very simple, two-host configuration.
2) Load modules gfs2, dlm, lock_dlm, and no_lock
3) Mount configfs in /sys/kernel/config
4) ccsd
5) cman_tool join
6) groupd
7) fenced
8) fence_tool join
9) dlm_controld
10) gfs_controld
11) I don't have clvmd, so I didn't start that, but the usage.txt file says it's optional.
12) mkfs.gfs2 -D -p lock_dlm -t fstest:testfs -j 2 /dev/sdb1
and it just hangs. There are no funny messages in either dmesg or /var/log/messages - just normal cluster operation (nodes being added when I start up things on both systems, etc.). Can anyone shed any light on what might be happening here - why it's hanging and transmitting so much data while formatting such a small volume?
Thanks, Nick
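As a sanity check on the debug output above, the figures can be cross-checked with a few lines of arithmetic. This is only a sketch: it assumes a "basic block" is 512 bytes (consistent with 1955808 basic blocks mapping to 244476 filesystem blocks at sb_bsize 4096) and that each bitmap byte covers 4 blocks; neither is stated in the output itself. The variable names are illustrative.

```python
# Cross-check the geometry that mkfs.gfs2 -D printed.
# Assumptions (not stated in the output): a "basic block" is 512 bytes,
# and each bitmap byte tracks 4 blocks (2 bits per block).
BASIC_BLOCK = 512
FS_BLOCK = 4096            # sb_bsize from the Super Block dump

basic_blocks = 1955808     # "Partition size"
device_bytes = basic_blocks * BASIC_BLOCK
fs_blocks = device_bytes // FS_BLOCK

# (rg_o, rg_l) pairs copied from the "subdevice 0" lines.
rgrps = [(17, 61117), (61134, 61114), (122248, 61114), (183362, 61114)]

# Each resource group should start where the previous one ends,
# and the last one should end exactly at the device size.
contiguous = all(
    offset + length == rgrps[i + 1][0]
    for i, (offset, length) in enumerate(rgrps[:-1])
)
end_of_last = rgrps[-1][0] + rgrps[-1][1]

# (ri_data, ri_bitbytes) pairs copied from the "ri_" lines.
bitmaps = [(61112, 15278), (61108, 15277), (61108, 15277), (61108, 15277)]
bitmaps_ok = all(data == 4 * nbytes for data, nbytes in bitmaps)

journal_bytes = 2 * 33554432   # two journals, di_size of each

print("fs blocks:", fs_blocks, "device bytes:", device_bytes)
print("resource groups contiguous:", contiguous, "end:", end_of_last)
print("bitmap sizes consistent:", bitmaps_ok)
print(f"journals take {journal_bytes / device_bytes:.1%} of the device")
```

If these checks hold, the layout mkfs.gfs2 computed is self-consistent and the device is just over 10^9 bytes, so 6.9 GB of traffic is roughly seven times the size of the whole volume.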
>Hi Nick,
>Since you're building it from source anyway, I recommend recompiling mkfs.gfs2
>with gdb debugging enabled. To do that, change the Makefile so that CFLAGS has
>-g instead of -O2. Then make; make install.
>Then do the command again, and when it hangs, go into gdb and see where
>it's hung. In other words, from another terminal session:
>cd /mkfs/source/directory/ e.g. /home/devel/cluster/gfs2/mkfs/
>ps ax | grep mkfs.gfs2 (to get the pid)
>gdb ./mkfs.gfs2 <pid>
>then do a "bt" to get a call stack.
>Post the results here or email them directly to me.
>The bt output should hopefully tell me what's going on.
>If mkfs.gfs2 is broken, open up bugzilla against me and I'll fix it.
>I've used mkfs.gfs2 many times and never had it hang.
>Regards,
>Bob Peterson
>Red Hat Cluster Suite
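The attach-and-backtrace procedure can also be scripted, which may help if the hang has to be reproduced several times. A minimal sketch, assuming gdb and pgrep are on the PATH; the helper names `parse_pids`, `find_pids`, and `gdb_backtrace_cmd` are made up for illustration. It relies on gdb's `-p`, `-batch`, and `-ex` options.

```python
import subprocess

def parse_pids(pgrep_output):
    """Turn pgrep's one-PID-per-line output into a list of ints."""
    return [int(tok) for tok in pgrep_output.split()]

def find_pids(name="mkfs.gfs2"):
    """Return PIDs of processes whose name matches exactly (pgrep -x)."""
    try:
        result = subprocess.run(["pgrep", "-x", name],
                                capture_output=True, text=True)
    except FileNotFoundError:
        return []          # pgrep is not installed
    return parse_pids(result.stdout)

def gdb_backtrace_cmd(pid):
    """Build a gdb command that attaches to pid, runs 'bt', and exits."""
    return ["gdb", "-batch", "-ex", "bt", "-p", str(int(pid))]

if __name__ == "__main__":
    for pid in find_pids():
        cmd = gdb_backtrace_cmd(pid)
        print("running:", " ".join(cmd))
        try:
            # The timeout keeps the script from blocking forever
            # if the attach itself never completes.
            out = subprocess.run(cmd, capture_output=True, text=True,
                                 timeout=30)
            print(out.stdout)
        except (subprocess.TimeoutExpired, FileNotFoundError) as exc:
            print("could not get a backtrace:", exc)
```

Running it from the mkfs source directory, against a binary rebuilt with -g, should give the same call stack as typing "bt" by hand.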
Bob,
Thanks for the quick follow-up. I gave this a shot, but here's the problem: the program seems to be hung on I/O, so it doesn't respond to any signals. When I try to run gdb against the currently running PID, gdb hangs after the following line:
Attaching to program: mkfs.gfs2, process 3349
and does nothing (sometimes it doesn't even get that far). I can't use <Ctrl-C> to kill the mkfs process, nor does it respond to a kill with signal 9. Also, during this mkfs.gfs2 process, the processor is being used quite heavily (80-100%) by the kernel thread [scsi_wq_1]. This tends to hang up things like login shells pretty badly, and the only way to get them back is to reboot the machine (in case it matters, this is inside a VMware virtual machine). I'm going to give a couple of other things a try (like changing the elevator so that it's nicer to other processes, hopefully) and see if I can get gdb to work, but so far no luck.
Thanks! Nick
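A process that ignores even signal 9 is usually in uninterruptible sleep (state D), and that can be checked from /proc without attaching a debugger. This is a sketch, assuming a Linux /proc layout and that /proc/<pid>/wchan is available on the running kernel; `state_from_stat` and `describe` are illustrative names, and 3349 is the PID from the gdb message above.

```python
import os

def state_from_stat(stat_line):
    """Extract the one-letter state from a /proc/<pid>/stat line.

    The command name is wrapped in parentheses and may itself contain
    spaces or ')', so split on the last ')' rather than on whitespace.
    """
    after_comm = stat_line.rsplit(")", 1)[1]
    return after_comm.split()[0]

def describe(pid):
    """Return (state, wchan) for a PID, reading only from /proc."""
    with open(f"/proc/{pid}/stat") as f:
        state = state_from_stat(f.read())
    try:
        with open(f"/proc/{pid}/wchan") as f:
            wchan = f.read().strip()
    except OSError:
        wchan = "?"        # wchan may be missing or unreadable
    return state, wchan

if __name__ == "__main__":
    # Demo on the current process; pass 3349 (or whatever PID the
    # stuck mkfs.gfs2 has) to inspect the hung process instead.
    try:
        print(describe(os.getpid()))
    except OSError as exc:
        print("no usable /proc here:", exc)
```

State D together with a wchan that names an I/O wait would be consistent with the process being stuck below the filesystem tools, in the block or SCSI layer, rather than in mkfs.gfs2's own code.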