Hello,
I have tried the new version 3 of GlusterFS with Xen.
We have two GlusterFS servers and two Xen servers, and we use
client-side replication for the domUs.
I would like to know whether our configuration is suitable for Xen domUs:
Server volfile:
# export-domU-images-server_repl
# gfs-01-01 /GFS/domU-images
# gfs-01-02 /GFS/domU-images
volume posix
  type storage/posix
  option directory /GFS/domU-images
end-volume

volume locks
  type features/locks
  subvolumes posix
end-volume

volume domU-images
  type performance/io-threads
  option thread-count 16        # default is 16
  subvolumes locks
end-volume

volume server
  type protocol/server
  option transport-type tcp
  option auth.addr.domU-images.allow 192.168.11.*,127.0.0.1
  option transport.socket.listen-port 6997
  subvolumes domU-images
end-volume
And the client volfile:
volume gfs-01-01
  type protocol/client
  option transport-type tcp
  option remote-host gfs-01-01
  option transport.socket.nodelay on
  option remote-port 6997
  option remote-subvolume domU-images
  option ping-timeout 7
end-volume

volume gfs-01-02
  type protocol/client
  option transport-type tcp
  option remote-host gfs-01-02
  option transport.socket.nodelay on
  option remote-port 6997
  option remote-subvolume domU-images
  option ping-timeout 7
end-volume

volume gfs-replicate
  type cluster/replicate
  subvolumes gfs-01-01 gfs-01-02
end-volume

volume writebehind
  type performance/write-behind
  option cache-size 16MB
  subvolumes gfs-replicate
end-volume

volume readahead
  type performance/read-ahead
  option page-count 16          # cache per file = (page-count x page-size)
  subvolumes writebehind
end-volume

volume iocache
  type performance/io-cache
  option cache-size 1GB
  option cache-timeout 1
  subvolumes readahead
end-volume
I start a domU and simulate a crash on gfs-01-02 (rcnetwork stop; sleep
150; rcnetwork start). The domU keeps running without any problems.
Client Log:
[2009-12-10 16:34:16] E
[client-protocol.c:415:client_ping_timer_expired] gfs-01-02: Server
xxx.xxx.xxx.xxx:6997 has not responded in the last 7 seconds, disconnecting.
[2009-12-10 16:34:16] E [saved-frames.c:165:saved_frames_unwind]
gfs-01-02: forced unwinding frame type(1) op(GETXATTR)
[2009-12-10 16:34:16] E [saved-frames.c:165:saved_frames_unwind]
gfs-01-02: forced unwinding frame type(2) op(PING)
[2009-12-10 16:34:16] N [client-protocol.c:6972:notify] gfs-01-02:
disconnected
[2009-12-10 16:34:38] E [socket.c:760:socket_connect_finish] gfs-01-02:
connection to xxx.xxx.xxx.xxx:6997 failed (No route to host)
[2009-12-10 16:34:38] E [socket.c:760:socket_connect_finish] gfs-01-02:
connection to xxx.xxx.xxx.xxx:6997 failed (No route to host)
Then the network on gfs-01-02 is started again.
Client Log:
[2009-12-10 16:35:15] N [client-protocol.c:6224:client_setvolume_cbk]
gfs-01-02: Connected to xxx.xxx.xxx.xxx:6997, attached to remote volume
'domU-images'.
[2009-12-10 16:35:18] N [client-protocol.c:6224:client_setvolume_cbk]
gfs-01-02: Connected to xxx.xxx.xxx.xxx:6997, attached to remote volume
'domU-images'.
[2009-12-10 16:35:20] E
[afr-self-heal-common.c:1186:sh_missing_entries_create] gfs-replicate:
no missing files - /vm_disks/virt-template. proceeding to metadata check
Everything looks good: the sync from gfs-01-01 to gfs-01-02 starts, but
the whole image is transferred. If we run an ls -la in the domU during
the transfer, the domU's prompt hangs; once the sync has finished, the
prompt comes back and we can continue working.
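One workaround we are considering is triggering the self-heal ourselves right after gfs-01-02 is back, so the first access from inside the domU does not have to wait. This is only a sketch: the mount point /mnt/domU-images is an assumption (adjust it to the actual client mount), and it relies on the pre-3.x behaviour that a lookup/stat from a client triggers AFR self-heal for that file:

```shell
#!/bin/sh
# Stat every file on the GlusterFS client mount. A lookup/stat from a
# client triggers AFR self-heal for that file, so this rebuilds the
# replicas proactively instead of blocking the first ls in the domU.
MOUNT="${MOUNT:-/mnt/domU-images}"   # assumption: client-side mount point
find "$MOUNT" -noleaf -print0 | xargs -0 stat >/dev/null
```

Running this from the dom0 right after the failed server reconnects should start the heal of all images in one go.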
My question: is this normal behaviour? In
http://ftp.gluster.com/pub/gluster/glusterfs/3.0/LATEST/GlusterFS-3.0.0-Release-Notes.pdf
we read:
2.1) Choice of self-heal algorithms
During self-heal of file contents, GlusterFS will now dynamically choose
between two algorithms based on file size:
a) "Full" algorithm – this algorithm copies the entire file data in
order to heal the out-of-sync copy. This algorithm is used when a file
has to be created from scratch on a server.
b) "Diff" algorithm – this algorithm compares blocks present on both
servers and copies only those blocks that are different from the correct
copy to the out-of-sync copy. This algorithm is used when files have to
be re-built partially.
The "Diff" algorithm is especially beneficial for situations such as
running VM images, where self-heal of a recovering replicated copy of
the image will occur much faster because only the changed blocks need to
be synchronized.
Can we change the self-heal algorithm in the config file?
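If I understand the 3.0 replicate (AFR) translator correctly, there is a volume option for this on the client side. A sketch for our client volfile; the option name data-self-heal-algorithm and its values "full"/"diff" are my assumption from the 3.0 AFR translator, so please correct me if it is named differently:

```
volume gfs-replicate
  type cluster/replicate
  # assumption: force the block-wise heal instead of the size-based choice;
  # accepted values should be "full" and "diff"
  option data-self-heal-algorithm diff
  subvolumes gfs-01-01 gfs-01-02
end-volume
```

With "diff" forced, a recovering VM image should only transfer the changed blocks rather than the whole file.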
Thank you very much
Roland Fischer