Hi, can you put this in bugzilla? Oopses are always bugs.

-- Lon

On Thu, 2005-06-09 at 19:04 -0800, Jay Cable wrote:
> I am using the "RHEL4 cluster" branch from cvs, and the 2.6.9-5.0.5.ELsmp
> kernel. I am using "lock_dlm" locking and the file system was created
> via:
> gfs_mkfs -r 1536 -j 3 -p lock_dlm -t ftp:dds_space /dev/mapper/ftp_space-erc1
> My cluster configuration is pretty simple - sanbox2 fencing with two nodes
> and the two nodes option set (<cman two_node="1" expected_votes="1">).
>
> I would greatly appreciate any advice folks have as to what I can do to
> fix this problem. From the list archives it appears that other folks are
> serving out gfs filesystems via nfs, so this should be possible, right?
>
> I have attached the relevant part of /var/log/messages
> for a crash. If any additional information would be helpful, please let
> me know, and I will get it (the crashes/hangs are very repeatable!).
>
> Thanks,
> -Jay Cable
>
> Here is the output from one of the crashes:
> Jun 9 19:23:46 jin kernel: send_arp uses obsolete (PF_INET,SOCK_PACKET)
> Jun 9 19:28:06 jin kernel: Bad page state at prep_new_page (in process 'nfsd', page c159f4e0)
> Jun 9 19:28:06 jin kernel: flags:0x20001020 mapping:f6a300e0 mapcount:0 count:2
> Jun 9 19:28:06 jin kernel: Backtrace:
> Jun 9 19:28:06 jin kernel: [<c013e669>] bad_page+0x58/0x89
> Jun 9 19:28:06 jin kernel: [<c013e9ec>] prep_new_page+0x24/0x3a
> Jun 9 19:28:06 jin kernel: [<c013eef8>] buffered_rmqueue+0x17d/0x1a5
> Jun 9 19:28:06 jin kernel: [<c013efd4>] __alloc_pages+0xb4/0x298
> Jun 9 19:28:06 jin kernel: [<c013baa2>] find_lock_page+0x96/0x9d
> Jun 9 19:28:06 jin kernel: [<c013d16d>] generic_file_buffered_write+0x10d/0x47c
> Jun 9 19:28:06 jin kernel: [<c013bac1>] find_or_create_page+0x18/0x72
> Jun 9 19:28:06 jin kernel: [<c013b775>] wake_up_page+0x9/0x29
> Jun 9 19:28:06 jin kernel: [<c013d85e>] generic_file_aio_write_nolock+0x382/0x3b0
> Jun 9 19:28:06 jin kernel: [<c013d910>] generic_file_write_nolock+0x84/0x99
> Jun 9 19:28:06 jin kernel: [<f8f96e5f>] gfs_glock_nq+0xe3/0x116 [gfs]
> Jun 9 19:28:06 jin kernel: [<c011e8d2>] autoremove_wake_function+0x0/0x2d
> Jun 9 19:28:06 jin kernel: [<f8fb7658>] gfs_trans_begin_i+0xfd/0x15a [gfs]
> Jun 9 19:28:06 jin kernel: [<f8faadd2>] do_do_write_buf+0x268/0x3b4 [gfs]
> Jun 9 19:28:06 jin kernel: [<f8fab02e>] do_write_buf+0x110/0x152 [gfs]
> Jun 9 19:28:06 jin kernel: [<f8faa238>] walk_vm+0xd3/0xf7 [gfs]
> Jun 9 19:28:06 jin kernel: [<f8f9709a>] gfs_glock_dq+0x111/0x11f [gfs]
> Jun 9 19:28:06 jin kernel: [<f8fab10d>] gfs_write+0x9d/0xb6 [gfs]
> Jun 9 19:28:06 jin kernel: [<f8faaf1e>] do_write_buf+0x0/0x152 [gfs]
> Jun 9 19:28:06 jin kernel: [<f8fab070>] gfs_write+0x0/0xb6 [gfs]
> Jun 9 19:28:06 jin kernel: [<c0155ba8>] do_readv_writev+0x1c5/0x21d
> Jun 9 19:28:06 jin kernel: [<c0154c92>] dentry_open+0xf0/0x1a5
> Jun 9 19:28:06 jin kernel: [<c0155c7e>] vfs_writev+0x3e/0x43
> Jun 9 19:28:06 jin kernel: [<f8c11b6b>] nfsd_write+0xeb/0x289 [nfsd]
> Jun 9 19:28:06 jin kernel: [<f8b2d5db>] svcauth_unix_accept+0x2d3/0x34a [sunrpc]
> Jun 9 19:28:06 jin kernel: [<f8c18356>] nfsd3_proc_write+0xbf/0xd5 [nfsd]
> Jun 9 19:28:06 jin kernel: [<f8c1a3a8>] nfs3svc_decode_writeargs+0x0/0x243 [nfsd]
> Jun 9 19:28:06 jin kernel: [<f8c0e5d7>] nfsd_dispatch+0xba/0x16f [nfsd]
> Jun 9 19:28:06 jin kernel: [<f8b2a446>] svc_process+0x420/0x6d6 [sunrpc]
> Jun 9 19:28:06 jin kernel: [<f8c0e3b7>] nfsd+0x1cc/0x332 [nfsd]
> Jun 9 19:28:06 jin kernel: [<f8c0e1eb>] nfsd+0x0/0x332 [nfsd]
> Jun 9 19:28:06 jin kernel: [<c01041f1>] kernel_thread_helper+0x5/0xb
> Jun 9 19:28:06 jin kernel: Trying to fix it up, but a reboot is needed
> Jun 9 19:30:34 jin kernel: ------------[ cut here ]------------
> Jun 9 19:30:34 jin kernel: kernel BUG at mm/vmscan.c:377!
> Jun 9 19:30:34 jin kernel: invalid operand: 0000 [#1]
> Jun 9 19:30:34 jin kernel: SMP
> Jun 9 19:30:34 jin kernel: Modules linked in: lock_dlm(U) dlm(U) cman(U) gfs(U) lock_harness(U) dm_mod qla2300 qla2xxx scsi_transport_fc nfsd exportfs lockd autofs4 i2c_dev i2c_core md5 ipv6 sunrpc ipt_REJECT ipt_state ip_conntrack iptable_filter ip_tables button battery ac uhci_hcd ehci_hcd e1000 floppy ext3 jbd raid1 ata_piix libata sd_mod scsi_mod
> Jun 9 19:30:34 jin kernel: CPU: 1
> Jun 9 19:30:34 jin kernel: EIP: 0060:[<c01447bd>] Tainted: GF B VLI
> Jun 9 19:30:34 jin kernel: EFLAGS: 00010202 (2.6.9-5.0.5.ELsmp)
> Jun 9 19:30:34 jin kernel: EIP is at shrink_list+0xa9/0x3ee
> Jun 9 19:30:34 jin kernel: eax: 20001049 ebx: f7cedecc ecx: c159f4f8 edx: c10f24d8
> Jun 9 19:30:34 jin kernel: esi: c159f4e0 edi: 00000021 ebp: f7cedf58 esp: f7cede54
> Jun 9 19:30:34 jin kernel: ds: 007b es: 007b ss: 0068
> Jun 9 19:30:34 jin kernel: Process kswapd0 (pid: 44, threadinfo=f7ced000 task=f7d1b7b0)
> Jun 9 19:30:34 jin kernel: Stack: 00000001 00000000 00000000 00000000 f7cedecc f7cede68 f7cede68 00000000
> Jun 9 19:30:34 jin kernel: 00000001 c12f4be0 c1204a00 00000246 f7ceded4 c0319e00 00000000 f7ceded4
> Jun 9 19:30:34 jin kernel: c0143bc0 c10639f8 00000296 c1f479c0 c10639e0 00000000 00000020 f7ced000
> Jun 9 19:30:34 jin kernel: Call Trace:
> Jun 9 19:30:34 jin kernel: [<c0143bc0>] __pagevec_release+0x15/0x1d
> Jun 9 19:30:34 jin kernel: [<c0144cdf>] shrink_cache+0x1dd/0x34d
> Jun 9 19:30:34 jin kernel: [<c014539d>] shrink_zone+0xa7/0xb6
> Jun 9 19:30:34 jin kernel: [<c0145740>] balance_pgdat+0x1b6/0x2f8
> Jun 9 19:30:34 jin kernel: [<c014594c>] kswapd+0xca/0xcc
> Jun 9 19:30:34 jin kernel: [<c011e8d2>] autoremove_wake_function+0x0/0x2d
> Jun 9 19:30:34 jin kernel: [<c02c6206>] ret_from_fork+0x6/0x14
> Jun 9 19:30:34 jin kernel: [<c011e8d2>] autoremove_wake_function+0x0/0x2d
> Jun 9 19:30:34 jin kernel: [<c0145882>] kswapd+0x0/0xcc
> Jun 9 19:30:34 jin kernel: [<c01041f1>] kernel_thread_helper+0x5/0xb
> Jun 9 19:30:34 jin kernel: Code: 71 e8 89 50 04 89 02 c7 41 04 00 02 20 00 c7 01 00 01 10 00 f0 0f ba 69 e8 00 19 c0 85 c0 0f 85 b8 02 00 00 8b 41 e8 a8 40 74 08 <0f> 0b 79 01 41 9a 2d c0 8b 41 e8 f6 c4 20 0f 85 96 02 00 00 8b

--
Linux-cluster@xxxxxxxxxx
http://www.redhat.com/mailman/listinfo/linux-cluster