Wow, it almost looked like the patch fixed the issue with using
stat-prefetch, but see below. I was almost unable to get it to crash with
du's or rm's on complex directories, as it did fairly easily before.
Also, I think it fixed a tiny anomaly that I had noticed but ignored.
Previously, even without stat-prefetch, multiple du's on a complex
directory could give slightly different total sizes (a few KB out of
many GB). Now, there is no such fluctuation at all.
I WAS able to get a crash with one machine doing du while a different
machine removed files from the same area. The du machine is the one where
the glusterfs client died (the first du completed, the second died). The
client left a backtrace in the log but no core, perhaps because I
compiled with CFLAGS=-O3. See attached backtrace.
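On the missing core: whether a core file gets written is normally governed by the process's core-size resource limit (RLIMIT_CORE) rather than by the optimization level, so that is worth ruling out first. A minimal sketch, assuming a POSIX system; the helper name core_dumps_enabled is made up for illustration and is not part of glusterfs:

```c
#include <sys/resource.h>

/* Hypothetical helper (not a glusterfs function).
 * Returns 1 if the soft core-size limit is non-zero,
 *         0 if core dumps are disabled (soft limit is 0),
 *        -1 if the limit could not be read. */
static int core_dumps_enabled(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_CORE, &rl) != 0)
        return -1;

    return rl.rlim_cur != 0 ? 1 : 0;
}
```

Calling this early in the client's startup and logging the result would show whether the limit was the reason no core appeared. The shell equivalent is checking `ulimit -c` in the environment that starts the client.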
Stat-prefetch did at least withstand a great deal more torture than before
the patch, so it seems to be a significant improvement. Note that I
haven't tried the new patch without stat-prefetch, so it's possible that
heavy testing might be able to kill it even without stat-prefetch; I'm not
sure.
Thanks,
Brent
PS Alas, there was no effect on the NFS reexport issue.
PPS The AFR client failover works pretty well, but I notice something.
The first attempt to access the glusterfs after losing contact with a
glusterfsd is sometimes faulty (e.g., the first df may say it's not
connected or give a smaller size for the volume; trying to cat a file may
not work on the first try). The very next attempt will succeed, however.
On Tue, 1 May 2007, Anand Avati wrote:
I was wondering if you could describe patch-134 a little? I was curious as
to whether or not it could be related to the stat-prefetch or the NFS
reexport issues.
this was a bug in afr which could have been triggered by anybody who used
AFR and accessed a directory. the functions forming the reply path of
a transaction are called through function pointers, and afr's opendir
reply callback prototype had an extra argument and dereferenced that
pointer (which is a junk pointer). so far all of us were lucky that the
dereferenced pointer happened to point to some allocated memory (though
nothing was altered or used). it is very much possible that this could
be related to the stat-prefetch. the latest glusterfs codebase now
prints a backtrace of a segfault in the log as well as dumps a core,
so next time you get a segfault please pass on the core and/or log.
I do not see how nfs re-export can be affected, but you never know if
this could have triggered a side effect somewhere else.
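The kind of prototype mismatch described above can be sketched as follows. This is only an illustration of the general shape of such a bug, not the actual afr code: every name here (reply_cbk_t, opendir_cbk_ok, unwind_reply) is hypothetical, and the real callback signatures in glusterfs are different. The point is that when the reply path goes through one typed function pointer, a callback with an extra argument is diagnosed by the compiler instead of silently reading junk.

```c
#include <stddef.h>

/* All names below are hypothetical; they only mirror the shape of the bug. */

/* The single agreed-upon callback type for the reply path. */
typedef int (*reply_cbk_t)(void *frame, int op_ret, int op_errno);

/* Correct callback: its prototype matches reply_cbk_t exactly. */
static int opendir_cbk_ok(void *frame, int op_ret, int op_errno)
{
    (void)frame;
    /* Report failures as a negative errno, successes as op_ret. */
    return op_ret < 0 ? -op_errno : op_ret;
}

/*
 * Buggy shape, kept as a comment on purpose: one extra trailing argument.
 *
 *   static int opendir_cbk_bad(void *frame, int op_ret, int op_errno,
 *                              void *extra)
 *   { return *(int *)extra; }      // dereferences a junk pointer
 *
 * The caller passes only three arguments, so 'extra' holds whatever
 * happens to be in that register or stack slot. Assigning this function
 * to a reply_cbk_t without a cast is diagnosed by the compiler; with a
 * cast (or an untyped pointer) the call is undefined behaviour.
 */

/* The caller only ever invokes callbacks through the typed pointer. */
static int unwind_reply(reply_cbk_t cbk, void *frame, int op_ret, int op_errno)
{
    return cbk(frame, op_ret, op_errno);
}
```

For example, `unwind_reply(opendir_cbk_ok, NULL, -1, 2)` yields -2, while trying to pass the four-argument variant is flagged at compile time as an incompatible pointer type.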
I have done only a very brief check with NFS re-export. once the next 1.3
release is done i will do a more thorough check.
regards,
avati
--
ultimate_answer_t
deep_thought (void)
{
  sleep (years2secs (7500000));
  return 42;
}
[May 01 15:48:48] [CRITICAL/common-utils.c:215/gf_print_trace()] debug-backtrace:Got signal (11), printing backtrace
[May 01 15:48:48] [CRITICAL/common-utils.c:217/gf_print_trace()] debug-backtrace:/usr/lib/libglusterfs.so.0(gf_print_trace+0x2d) [0xb7f04c8d]
[May 01 15:48:48] [CRITICAL/common-utils.c:217/gf_print_trace()] debug-backtrace:[0xffffe420]
[May 01 15:48:48] [CRITICAL/common-utils.c:217/gf_print_trace()] debug-backtrace:/usr/lib/libglusterfs.so.0 [0xb7f02e19]
[May 01 15:48:48] [CRITICAL/common-utils.c:217/gf_print_trace()] debug-backtrace:/usr/lib/libglusterfs.so.0 [0xb7f02e19]
[May 01 15:48:48] [CRITICAL/common-utils.c:217/gf_print_trace()] debug-backtrace:/usr/lib/glusterfs/1.3.0-pre3/xlator/performance/stat-prefetch.so [0xb7f0ddd4]
[May 01 15:48:48] [CRITICAL/common-utils.c:217/gf_print_trace()] debug-backtrace:/usr/lib/glusterfs/1.3.0-pre3/xlator/cluster/unify.so [0xb7549371]
[May 01 15:48:48] [CRITICAL/common-utils.c:217/gf_print_trace()] debug-backtrace:/usr/lib/glusterfs/1.3.0-pre3/xlator/cluster/afr.so [0xb754f398]
[May 01 15:48:48] [CRITICAL/common-utils.c:217/gf_print_trace()] debug-backtrace:/usr/lib/glusterfs/1.3.0-pre3/xlator/protocol/client.so [0xb755a637]
[May 01 15:48:48] [CRITICAL/common-utils.c:217/gf_print_trace()] debug-backtrace:/usr/lib/glusterfs/1.3.0-pre3/xlator/protocol/client.so [0xb75597d2]
[May 01 15:48:48] [CRITICAL/common-utils.c:217/gf_print_trace()] debug-backtrace:/usr/lib/libglusterfs.so.0(transport_notify+0x1d) [0xb7f0616d]
[May 01 15:48:48] [CRITICAL/common-utils.c:217/gf_print_trace()] debug-backtrace:/usr/lib/libglusterfs.so.0(sys_epoll_iteration+0xe7) [0xb7f06e17]
[May 01 15:48:48] [CRITICAL/common-utils.c:217/gf_print_trace()] debug-backtrace:/usr/lib/libglusterfs.so.0(poll_iteration+0x1d) [0xb7f0621d]
[May 01 15:48:48] [CRITICAL/common-utils.c:217/gf_print_trace()] debug-backtrace:[glusterfs] [0x804a8ce]
[May 01 15:48:48] [CRITICAL/common-utils.c:217/gf_print_trace()] debug-backtrace:/lib/tls/i686/cmov/libc.so.6(__libc_start_main+0xdc) [0xb7da1ebc]
[May 01 15:48:48] [CRITICAL/common-utils.c:217/gf_print_trace()] debug-backtrace:[glusterfs] [0x804a071]