Replies inline

> back to our original problem of all-hanging glusterfs servers and
> clients.
> Today we got another hang with same look and feel, but this time we got
> something in the logs, please read and tell us how to further proceed.
> Configuration is as before. I send the whole log since boot, crash is
> visible at the end. We did the same testing as before, running two
> bonnies on two clients.
>
> Linux version 2.6.30.5 (root@linux-tnpx) (gcc version 4.3.2 [gcc-4_3-branch revision 141291] (SUSE Linux) ) #1 SMP Tue Aug 18 12:06:06 CEST 2009
> general protection fault: 0000 [#1] SMP
> last sysfs file: /sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map
> CPU 2
> Modules linked in: fuse loop i2c_i801 i2c_core e100 e1000e
> Pid: 3833, comm: glusterfsd Not tainted 2.6.30.5 #1 empty
> RIP: 0010:[<ffffffff80244305>] [<ffffffff80244305>] __wake_up_bit+0xc/0x2d
> RSP: 0018:ffff88011fc51a98 EFLAGS: 00010292
> RAX: 8dfd233fe2300848 RBX: ffffe20000220058 RCX: 0000000000000040
> RDX: 0000000000000000 RSI: ffffe20000220058 RDI: 8dfd233fe2300840
> RBP: ffff8800b3be03b0 R08: b000000000000000 R09: ffffe20000220058
> R10: ffffffffb3be03b1 R11: 0000000000000001 R12: 00000000000021a4
> R13: 00000000021a4000 R14: ffff8800b3be03b0 R15: 00000000000021a4
> FS: 00007f684127f950(0000) GS:ffff880028052000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007fff5373eb78 CR3: 000000011fc79000 CR4: 00000000000006e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process glusterfsd (pid: 3833, threadinfo ffff88011fc50000, task ffff880126ad53e0)
> Stack:
> ffffffff8048eb40 ffffffff00000000 ffffe20000220058 ffffffff8025fa3f
> ffffffff8048eb40 0000000000004000 00000000ffffffff ffffffff8025fc36
> 000000d0b3be0298 ffffffff8048eb40 0000000000004000 00000000fffffff4
> Call Trace:
> [<ffffffff8025fa3f>] ? find_lock_page+0x43/0x55
> [<ffffffff8025fc36>] ? grab_cache_page_write_begin+0x3b/0xa1
> [<ffffffff802d34ef>] ? reiserfs_write_begin+0x81/0x1dc
> [<ffffffff802d5505>] ? reiserfs_get_block+0x0/0xeb5
> [<ffffffff8026054a>] ? generic_file_buffered_write+0x12c/0x2fa
> [<ffffffff80260bf7>] ? __generic_file_aio_write_nolock+0x349/0x37d
> [<ffffffff8024d33f>] ? futex_wait+0x41a/0x42f
> [<ffffffff802613ea>] ? generic_file_aio_write+0x64/0xc4
> [<ffffffff80261386>] ? generic_file_aio_write+0x0/0xc4
> [<ffffffff80286f09>] ? do_sync_readv_writev+0xc0/0x107
> [<ffffffff8024d486>] ? futex_wake+0xc8/0xd9
> [<ffffffff80244348>] ? autoremove_wake_function+0x0/0x2e
> [<ffffffff8024e5e8>] ? do_futex+0xa9/0x8b3
> [<ffffffff80286d95>] ? rw_copy_check_uvector+0x6d/0xe4
> [<ffffffff80287581>] ? do_readv_writev+0xb2/0x18b
> [<ffffffff80249bb7>] ? getnstimeofday+0x55/0xaf
> [<ffffffff80246b62>] ? ktime_get_ts+0x21/0x49
> [<ffffffff8028775b>] ? sys_writev+0x45/0x6e
> [<ffffffff8020ae6b>] ? system_call_fastpath+0x16/0x1b
> Code: 00 48 29 f8 2b 8a d0 e7 5d 80 4c 01 c0 48 d3 e8 48 6b c0 18 48 03 82 c0 e7 5d 80 5a 5b 5d c3 48 83 ec 18 48 8d 47 08 89 54 24 08 <48> 39 47 08 74 16 48 89 34 24 48 89 e1 ba 01 00 00 00 be 03 00
> RIP [<ffffffff80244305>] __wake_up_bit+0xc/0x2d
> RSP <ffff88011fc51a98>
> ---[ end trace 10a1fa47d70a1dc4 ]---

The right place to post this backtrace is reiserfs-devel@xxxxxxxxxxxxxxx.
You could do them a favor by mentioning the closest pair of kernel
versions in which this issue is not-seen, and then appears -- if you have
the time to do that for them. For all you know it might already be fixed
in a newer kernel version, but you will find the right answer in that ML.

Avati
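
One way to narrow down that "closest pair" of kernel versions is a plain binary search over an ordered list of candidates. The sketch below is only an illustration: the function name `closest_good_bad_pair`, the placeholder labels `k1`..`k6`, and the `is_bad` callback are hypothetical, and nothing here says which real kernel versions are good or bad. In practice `is_bad` would be a manual step (boot that kernel, rerun the two-bonnies test, note whether the hang shows up). It assumes the oldest candidate is good, the newest is bad, and that the bug stays present once it appears.

```python
from typing import Callable, Sequence, Tuple


def closest_good_bad_pair(
    versions: Sequence[str],
    is_bad: Callable[[str], bool],
) -> Tuple[str, str]:
    """Return the adjacent (last_good, first_bad) pair from `versions`.

    `versions` must be ordered oldest to newest. The first entry is
    expected to be good and the last one bad; anything else is rejected,
    because then there is no good/bad boundary to search for.
    """
    if len(versions) < 2:
        raise ValueError("need at least two versions to find a pair")

    lo, hi = 0, len(versions) - 1
    if is_bad(versions[lo]):
        raise ValueError("oldest version already shows the problem")
    if not is_bad(versions[hi]):
        raise ValueError("newest version does not show the problem")

    # Invariant: versions[lo] is good, versions[hi] is bad.
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(versions[mid]):
            hi = mid
        else:
            lo = mid
    return versions[lo], versions[hi]


if __name__ == "__main__":
    # Placeholder labels only; the "bad" set is invented for the demo.
    candidates = ["k1", "k2", "k3", "k4", "k5", "k6"]
    pretend_bad = {"k5", "k6"}
    print(closest_good_bad_pair(candidates, lambda v: v in pretend_bad))
```

Each step halves the remaining range, so even a long list of candidate kernels needs only a handful of build-and-test rounds. The resulting pair is what the reiserfs developers would want to see alongside the backtrace.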