Hi there!
We are using gluster in an environment with multiple webservers and a load
balancer, where we have only one server and multiple (6) clients.
All servers are running Fedora Core 6 x86_64 with kernel
2.6.22.14-72.fc6 (with exactly the same packages installed on all servers).
The gluster version used is 1.3.8pre2 + fuse 2.7.2glfs8 (both compiled
locally). The underlying FS is reiserfs mounted with the following options:
rw,noatime,nodiratime,notail. This filesystem holds almost 4 thousand
files ranging from 2 KB to 10 MB in size. We are using gluster to export this
filesystem to all the other webservers. Below is the config file used by the
gluster server:
### Export the "attachments" volume with the contents of the attachments directory.
volume attachments-nl
type storage/posix # POSIX FS translator
option directory /C3Systems/data/domains/webmail.pop.com.br/attachments
end-volume
volume attachments
type features/posix-locks
subvolumes attachments-nl
option mandatory on
end-volume
### Add network serving capability to above brick.
volume server
type protocol/server
option transport-type tcp/server # For TCP/IP transport
option client-volume-filename /C3Systems/gluster/bin/etc/glusterfs/glusterfs-client.vol
subvolumes attachments-nl attachments
option auth.ip.attachments-nl.allow * # Allow access to "attachments-nl" volume
option auth.ip.attachments.allow * # Allow access to "attachments" volume
end-volume
The problem happens when the LB sends a POST (the uploaded file) to one
webserver and the next POST goes to another webserver that tries to
access the same file. When this happens, the other client gets these messages:
PHP Warning:
fopen(/C3Systems/data/domains/c3systems.com.br/attachments/27gBgFQSIiOLDEo7AvxlpsFkqZw9jdnZ):
failed to open stream: File Not Found.
PHP Warning:
unlink(/C3Systems/data/domains/c3systems.com.br/attachments/5Dech7jNxjORZ2cZ9IAbR7kmgmgn2vTE):
File Not Found.
The LB uses round-robin to distribute the load between the servers.
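As a stopgap on the reader side, we have been considering retrying the open for a short while before giving up. A minimal sketch of the idea (in Python just for illustration; our application is PHP, and the attempt count and delay values are guesses):

```python
import time

def open_with_retry(path, attempts=5, delay=0.2):
    """Try to open `path`, retrying while it does not exist yet.

    Papers over a short window in which a file uploaded through one
    GlusterFS client may not yet be visible on another client.
    """
    for i in range(attempts):
        try:
            return open(path, "rb")
        except FileNotFoundError:
            if i == attempts - 1:
                raise  # still missing after all attempts; give up
            time.sleep(delay)  # give the writer's client time to propagate
```

This obviously only masks the race rather than fixing it, which is why we are asking about a proper sync-style option below.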
Below, you can find the gluster configuration file used by all clients:
### file: client-volume.spec.sample
##############################################
### GlusterFS Client Volume Specification ##
##############################################
#### CONFIG FILE RULES:
### "#" is comment character.
### - Config file is case sensitive
### - Options within a volume block can be in any order.
### - Spaces or tabs are used as delimiters within a line.
### - Each option should end within a line.
### - Missing or commented fields will assume default values.
### - Blank/commented lines are allowed.
### - Sub-volumes should already be defined above before referring.
### Add client feature and attach to remote subvolume
volume client
type protocol/client
option transport-type tcp/client # for TCP/IP transport
# option ib-verbs-work-request-send-size 1048576
# option ib-verbs-work-request-send-count 16
# option ib-verbs-work-request-recv-size 1048576
# option ib-verbs-work-request-recv-count 16
# option transport-type ib-sdp/client # for Infiniband transport
# option transport-type ib-verbs/client # for ib-verbs transport
option remote-host 1.2.3.4 # IP address of the remote brick
# option remote-port 6996 # default server port is 6996
# option transport-timeout 30 # seconds to wait for a reply from server for each request
option remote-subvolume attachments # name of the remote volume
end-volume
### Add readahead feature
volume readahead
type performance/read-ahead
option page-size 1MB # unit in bytes
option page-count 2 # cache per file = (page-count x page-size)
subvolumes client
end-volume
### Add IO-Cache feature
volume iocache
type performance/io-cache
option page-size 256KB
option page-count 2
subvolumes readahead
end-volume
### Add writeback feature
#volume writeback
# type performance/write-behind
# option aggregate-size 1MB
# option flush-behind off
# subvolumes iocache
#end-volume
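In case it is relevant: to rule out the client-side caches, we could also mount with a stripped-down client config that drops the read-ahead and io-cache translators, i.e. just the protocol/client volume from above:

```
### Minimal client config for testing: no performance translators
volume client
type protocol/client
option transport-type tcp/client
option remote-host 1.2.3.4
option remote-subvolume attachments
end-volume
```

We have not yet confirmed whether this changes the behavior under real load.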
When I do the test manually, everything goes fine. What I think is
happening is that gluster isn't having enough time to sync all clients
before the clients try to access the files (these servers are very busy
ones... they receive millions of requests per day).
Is this configuration appropriate for this situation? Is it a bug? A feature
;-)? Is there any option, like the sync option used in NFS, that I can use to
guarantee that once a file is written down, all the clients can already
see it?
TIA,
Claudio Cuqui