Hello.
I am seeing an error when I run
find /mnt/glusterfs/ -type f -exec md5sum {} \; >/dev/null
It fails with:
md5sum: /mnt/glusterfs/2009/1/18/13/E4E4EF76/AF3A768D/F777FC5E:
Transport endpoint is not connected
The problem does not seem to be related to any single file: it has
already failed twice on different files, each time after about 60
minutes. This particular file exists on both storage nodes and the
md5sums match. The extended attributes are identical as well:
trusted.afr.g1b2=0sAAAAAAAAAAAAAAAA
trusted.afr.g2b2=0sAAAAAAAAAAAAAAAA
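The attributes above were read on each storage node with getfattr, roughly like this (the backend path is just an example under my export directory):
getfattr -d -m trusted.afr -e base64 /export/gdisk2/storage/2009/1/18/13/E4E4EF76/AF3A768D/F777FC5E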
The backend filesystem is ext4; xfs was failing for some strange reason.
The client log contains only this:
pending frames:
patchset: 82394d484803e02e28441bc0b988efaaff60dd94
signal received: 6
configuration details:argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 2.0.0rc8
[0xb8067400]
/lib/i686/cmov/libc.so.6(abort+0x188)[0xb7ede018]
/lib/i686/cmov/libc.so.6(__assert_fail+0xee)[0xb7ed55be]
/lib/i686/cmov/libpthread.so.0(pthread_mutex_lock+0x5d4)[0xb8013f54]
/usr/lib/libfuse.so.2[0xb75d3147]
/usr/lib/libfuse.so.2(fuse_session_process+0x26)[0xb75d4bf6]
/usr/local/lib/glusterfs/2.0.0rc8/xlator/mount/fuse.so[0xb75eef31]
/lib/i686/cmov/libpthread.so.0[0xb80124c0]
/lib/i686/cmov/libc.so.6(clone+0x5e)[0xb7f916de]
---------
There are no errors in the storage server logs.
Systems:
Debian Stable,
kernel 2.6.29.1
gcc (Debian 4.3.2-1.1) 4.3.2
glusterfs 2.0.0rc8 built on Apr 20 2009 23:04:47
Repository revision: 82394d484803e02e28441bc0b988efaaff60dd94
fuse 2.7.4 from Debian
Configuration:
2 storage nodes (Dual Core, 2GB RAM, 3x1TB SATA)
1 client (Dual Core, 6GB RAM)
Storage config:
===========================================
volume gdisk1
type storage/posix
option directory /export/gdisk1/storage/
end-volume
volume brick1
type features/posix-locks
subvolumes gdisk1
end-volume
volume gdisk2
type storage/posix
option directory /export/gdisk2/storage/
end-volume
volume brick2
type features/posix-locks
subvolumes gdisk2
end-volume
volume gdisk3
type storage/posix
option directory /export/gdisk3/storage/
end-volume
volume brick3
type features/posix-locks
subvolumes gdisk3
end-volume
volume server
type protocol/server
option transport-type tcp
option auth.addr.brick1.allow *
option auth.addr.brick2.allow *
option auth.addr.brick3.allow *
subvolumes brick1 brick2 brick3
end-volume
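The server side is started with the stock glusterfsd binary pointing at this volfile, roughly like this (the volfile and log paths are just where I keep them):
glusterfsd -f /etc/glusterfs/glusterfsd.vol -l /var/log/glusterfs/glusterfsd.log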
Client config:
===========================
volume g1b1
type protocol/client
option transport-type tcp/client
option remote-host 10.10.70.2
option remote-subvolume brick1
end-volume
volume g1b2
type protocol/client
option transport-type tcp
option remote-host 10.10.70.2
option remote-subvolume brick2
end-volume
volume g1b3
type protocol/client
option transport-type tcp
option remote-host 10.10.70.2
option remote-subvolume brick3
end-volume
volume g2b1
type protocol/client
option transport-type tcp/client
option remote-host 10.10.70.3
option remote-subvolume brick1
end-volume
volume g2b2
type protocol/client
option transport-type tcp
option remote-host 10.10.70.3
option remote-subvolume brick2
end-volume
volume g2b3
type protocol/client
option transport-type tcp
option remote-host 10.10.70.3
option remote-subvolume brick3
end-volume
volume replicate1
type cluster/replicate
subvolumes g1b1 g2b1
end-volume
volume replicate2
type cluster/replicate
subvolumes g1b2 g2b2
end-volume
volume replicate3
type cluster/replicate
subvolumes g1b3 g2b3
end-volume
volume distribute
type cluster/distribute
subvolumes replicate1 replicate2 replicate3
end-volume
volume readahead
type performance/read-ahead
option page-size 128kB # 256KB is the default option
option page-count 4 # 2 is default option
option force-atime-update off # default is off
subvolumes distribute
end-volume
volume io-cache
type performance/io-cache
option cache-size 64MB # default is 32MB
option page-size 1MB #128KB is default option
option cache-timeout 2 # default is 1 second
subvolumes readahead
end-volume
volume writebehind
type performance/write-behind
option aggregate-size 128KB # default is 0bytes
option window-size 4MB # default is equal to aggregate-size
option flush-behind on # default is 'off'
subvolumes io-cache
end-volume
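The client is mounted with the glusterfs binary and this volfile, something like (again, the volfile path is just an example):
glusterfs -f /etc/glusterfs/glusterfs-client.vol /mnt/glusterfs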
Is there anything I can do to help debug this?
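If it helps, I can re-run the client at a higher log level and repeat the test, something like:
glusterfs -L DEBUG -f /etc/glusterfs/glusterfs-client.vol /mnt/glusterfs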
Also, it looks like there have been some significant changes to the
write-behind translator; as I understand it, aggregate-size is no
longer used?
Thanks.
JV