On Fri, 13 Jun 2008, Anand Avati wrote:
Indeed it did. However, when doing multiple 10 GB dd writes on my four
client/servers, two of my clients died; the other two finished their tasks
without issue and stayed operational. (The writes were roughly of the form
sketched after the log excerpt below.) There were a large number of "frame
:" lines in glusterfs.log, and the log ended with:
frame : type(2) op(0)
frame : type(2) op(0)
2008-06-11 20:43:20 C [common-utils.c:155:gf_print_bytes] : xfer == 142359133061, rcvd == 130379872
[0xb7fb4420]
/lib/tls/i686/cmov/libc.so.6(abort+0x101)[0xb7e4ea01]
/usr/lib/glusterfs/1.4.0qa19/xlator/mount/fuse.so[0xb758dd84]
/lib/tls/i686/cmov/libpthread.so.0[0xb7f764fb]
/lib/tls/i686/cmov/libc.so.6(clone+0x5e)[0xb7ef8e5e]
---------
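For reference, each write was roughly of this form; the mount point, file
name, and block size here are only illustrative:

  dd if=/dev/zero of=/mnt/glusterfs/testfile0 bs=1M count=10240   # one ~10 GB sequential write per client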
Did the client memory usage grow very high before it crashed? Can you tell
us the size of the core dump file? Also, is it possible to get a gdb
backtrace on the core dump? This will help us a lot.
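Something along these lines should work; the binary path below is only a
guess at a typical install and may differ on your system:

  gdb /usr/sbin/glusterfs /path/to/core    # open the client binary together with its core dump
  (gdb) bt                                 # backtrace of the thread that aborted
  (gdb) thread apply all bt                # backtraces of all threads, often more revealing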
I didn't think to look for a core dump. The client must have grown huge,
as the core dump was 3 GB. Unfortunately, I didn't compile with debugging
symbols; here's the backtrace anyway, just in case:
#0 0xb7fb4410 in __kernel_vsyscall ()
#1 0xb7e4d085 in raise () from /lib/tls/i686/cmov/libc.so.6
#2 0xb7e4ea01 in abort () from /lib/tls/i686/cmov/libc.so.6
#3 0xb758dd84 in ?? () from /usr/lib/glusterfs/1.4.0qa19/xlator/mount/fuse.so
#4 0x00000001 in ?? ()
#5 0x00020040 in ?? ()
#6 0x00021000 in ?? ()
#7 0x08056de0 in ?? ()
#8 0x00000000 in ?? ()
If there aren't enough clues here, I can recompile with debugging symbols
and try to reproduce the crash.
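For reference, the rebuild I have in mind is roughly the following; the
prefix and flags are illustrative and would need adjusting for this setup:

  ulimit -c unlimited                          # keep core dumps enabled for the next crash
  CFLAGS="-g -O0" ./configure --prefix=/usr    # build with debug symbols and no optimization
  make && make install                         # reinstall the client and xlators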
Thanks,
Brent