Hi there!
We are using gluster in an environment with multiple webservers and a load
balancer, where we have only one server and multiple (6) clients.
All servers are running Fedora Core 6 x86_64 with kernel
2.6.22.14-72.fc6 (with exactly the same packages installed on all servers).
The gluster version used is 1.3.8pre2 + fuse 2.7.2glfs8 (both compiled
locally). The underlying FS is reiserfs mounted with the following options:
rw,noatime,nodiratime,notail. This filesystem holds almost 4 thousand
files ranging from 2 KB to 10 MB in size. We are using gluster to export this
filesystem to all the other webservers. Below is the config file used by the
gluster server:
### Export the "attachments" volume with the contents of the attachments directory.
volume attachments-nl
type storage/posix # POSIX FS translator
option directory /C3Systems/data/domains/webmail.pop.com.br/attachments
end-volume
volume attachments
type features/posix-locks
subvolumes attachments-nl
option mandatory on
end-volume
### Add network serving capability to above brick.
volume server
type protocol/server
option transport-type tcp/server # For TCP/IP transport
option client-volume-filename /C3Systems/gluster/bin/etc/glusterfs/glusterfs-client.vol
subvolumes attachments-nl attachments
option auth.ip.attachments-nl.allow * # Allow access to "attachments-nl" volume
option auth.ip.attachments.allow * # Allow access to "attachments" volume
end-volume
The problem happens when the LB sends a POST (the uploaded file) to one
webserver and the next POST goes to another webserver that tries to
access the same file. When this happens, the other client gets these messages:
PHP Warning:
fopen(/C3Systems/data/domains/c3systems.com.br/attachments/27gBgFQSIiOLDEo7AvxlpsFkqZw9jdnZ):
failed to open stream: File Not Found.
PHP Warning:
unlink(/C3Systems/data/domains/c3systems.com.br/attachments/5Dech7jNxjORZ2cZ9IAbR7kmgmgn2vTE):
File Not Found.
The LB uses round-robin to distribute the load between the servers.
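As a stopgap on the reader side, we have been considering retrying the open for a short while before giving up. A minimal sketch of the idea (in Python just for illustration; our application is PHP, and the attempt count and delay values are guesses):

```python
import time

def open_with_retry(path, attempts=5, delay=0.2):
    """Try to open `path`, retrying while it does not exist yet.

    Papers over a short window in which a file uploaded through one
    GlusterFS client may not yet be visible on another client.
    """
    for i in range(attempts):
        try:
            return open(path, "rb")
        except FileNotFoundError:
            if i == attempts - 1:
                raise  # still missing after all attempts; give up
            time.sleep(delay)  # give the writer's client time to propagate
```

This obviously only masks the race rather than fixing it, which is why we are asking about a proper sync-style option below.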
Below, you can find the gluster configuration file used by all clients:
### file: client-volume.spec.sample
##############################################
### GlusterFS Client Volume Specification ##
##############################################
#### CONFIG FILE RULES:
### "#" is comment character.
### - Config file is case sensitive
### - Options within a volume block can be in any order.
### - Spaces or tabs are used as delimiters within a line.
### - Each option should end within a line.
### - Missing or commented fields will assume default values.
### - Blank/commented lines are allowed.
### - Sub-volumes should already be defined above before referring.
### Add client feature and attach to remote subvolume
volume client
type protocol/client
option transport-type tcp/client # for TCP/IP transport
# option ib-verbs-work-request-send-size 1048576
# option ib-verbs-work-request-send-count 16
# option ib-verbs-work-request-recv-size 1048576
# option ib-verbs-work-request-recv-count 16
# option transport-type ib-sdp/client # for Infiniband transport
# option transport-type ib-verbs/client # for ib-verbs transport
option remote-host 1.2.3.4 # IP address of the remote brick
# option remote-port 6996 # default server port is 6996
# option transport-timeout 30 # seconds to wait for a reply from server for each request
option remote-subvolume attachments # name of the remote volume
end-volume
### Add readahead feature
volume readahead
type performance/read-ahead
option page-size 1MB # unit in bytes
option page-count 2 # cache per file = (page-count x page-size)
subvolumes client
end-volume
### Add IO-Cache feature
volume iocache
type performance/io-cache
option page-size 256KB
option page-count 2
subvolumes readahead
end-volume
### Add writeback feature
#volume writeback
# type performance/write-behind
# option aggregate-size 1MB
# option flush-behind off
# subvolumes iocache
#end-volume
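In case it is relevant: to rule out the client-side caches, we could also mount with a stripped-down client config that drops the read-ahead and io-cache translators, i.e. just the protocol/client volume from above:

```
### Minimal client config for testing: no performance translators
volume client
type protocol/client
option transport-type tcp/client
option remote-host 1.2.3.4
option remote-subvolume attachments
end-volume
```

We have not yet confirmed whether this changes the behavior under real load.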
When I do the test manually, everything goes fine. What I think is
happening is that gluster isn't having enough time to sync all clients
before the clients try to access the files (these servers are very busy
ones... they receive millions of requests per day).
Is this configuration appropriate for this situation? Is it a bug? A feature
;-)? Is there any option, like the sync option used in NFS, that I can use to
guarantee that once a file is written down, all the clients can already
see it?
TIA,
Claudio Cuqui