I'm getting a lot of errors on an AFR/unify setup with six storage bricks
using ib-verbs, and I'd like some help understanding what is critical.
For some reason this setup is very unstable, and we want to know how to
make it as robust as the architecture suggests it should be.
The problem is that whenever we copy files, we get hundreds of the
following three errors on the client:
2008-03-17 12:31:00 E [fuse-bridge.c:699:fuse_fd_cbk] glusterfs-fuse:
38: /tftpboot/node_root/lib/modules/2.6.24.1/modules.symbols => -1 (5)
2008-03-17 12:31:00 E [unify.c:850:unify_open] main:
/tftpboot/node_root/lib/modules/2.6.24.1/kernel/arch/x86/kernel/cpuid.ko:
entry_count is 3
2008-03-17 12:31:00 E [unify.c:853:unify_open] main:
/tftpboot/node_root/lib/modules/2.6.24.1/kernel/arch/x86/kernel/cpuid.ko:
found on main-ns
Files still copy despite these errors (error 5 is EIO), but very slowly.
Additionally, we cannot lose even one storage brick without the
cluster freezing.
We have a fairly common AFR/unify setup with six storage bricks:
namespace:
  Storage_01 <-AFR-> RTPST202 <-AFR-> Storage_03
storage:
  Storage_01 <-AFR-> Storage_02
  Storage_03 <-AFR-> Storage_04
  Storage_05 <-AFR-> Storage_06
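In outline, the client-side volume spec for this layout looks roughly
like the sketch below (volume names are abbreviated and only one data
pair is shown; this is not our exact file):

```
# one protocol/client volume per brick (remaining bricks defined the same way)
volume client-s01
  type protocol/client
  option transport-type ib-verbs
  option remote-host Storage_01
  option remote-subvolume brick
end-volume

volume client-s02
  type protocol/client
  option transport-type ib-verbs
  option remote-host Storage_02
  option remote-subvolume brick
end-volume

# mirrored data pair (afr-data2 and afr-data3 mirror 03/04 and 05/06)
volume afr-data1
  type cluster/afr
  subvolumes client-s01 client-s02
end-volume

# three-way mirrored namespace across Storage_01, RTPST202, Storage_03
# (client volumes for the namespace bricks defined as above)
volume afr-ns
  type cluster/afr
  subvolumes ns-s01 ns-rtpst202 ns-s03
end-volume

volume unify0
  type cluster/unify
  option namespace afr-ns
  option scheduler rr
  subvolumes afr-data1 afr-data2 afr-data3
end-volume
```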
All this is running on TLA version 703 with the latest patched fuse module.
Any suggestions would be appreciated!
Thanks!
-Mickey Mazarick
--