Bernhard,

Some updates since my last mail. Based on the discussion we have had, we felt it was appropriate to let the user specify priorities to io-cache. The latest TLA checkout of io-cache supports a priority specification such as:

  option priority */global/*.jpg:3,*/thumbnails/*.jpg:2,*.html:1

As a small insight into what this means:

- The content of a file is cached with the priority given for it in the spec file. By default a file has priority '0'.
- The higher the priority value, the more affinity the file has to stay in cache.
- When the cache-size limit is reached, cache pruning starts with the lowest priority. Within a priority, files are maintained in an LRU list. Only when all pages of a given priority have been pruned is the next higher priority considered.

It is advisable to use priority numbers starting at 1 and incrementing by 1. That is, providing priorities like 100,10,1 for your three types is not recommended; use 3,2,1 instead.

I hope this is of use for your setup.

thanks,
avati

2007/7/25, Bernhard J. M. Grün <bernhard.gruen@xxxxxxxxxxxxxx>:
Anand,

the amount of files is - at the moment - 13.000.000 image files. The size of our cluster is 2x10TB (via AFR, so 10TB usable). This means that we don't have enough RAM to hold all of them. But we could use at most 2 GB of RAM on each server for caching purposes.

Our small files consist of three different types: global galleries (like event photos), user galleries and group galleries. The global galleries are in my opinion the only files worth caching, and for these global galleries only the newest files are interesting. On the other hand there are two image sizes: thumbnails and full size images. It would be best to cache only the small thumbnails because these won't clutter the cache too much (size: about 4-5 KB each).

At the moment we are using a squid cache in front of our web server to do exactly this job, but squid is not optimized for that many connections. Therefore the io-cache should be exactly the feature we want.

Thanks for your great help!

Bernhard

2007/7/24, Anand Avati <avati@xxxxxxxxxxxxx>:
> Bernhard,
> io-cache on the client side gives the best performance. The newer versions
> will have advantages in loading io-cache on the server as well, but for now
> it is designed for the client side.
>
> As for the optimum size for io-cache, what is the total size of all the
> combined images being served? Would they all fit in your RAM? What is the
> access pattern of these small files? How many bytes (of other files) are
> generally accessed before a file is re-used? The io-cache currently uses an
> LRU algorithm to age out old pages and files.
>
> If they all fit in your RAM, then give a cache-size of that size plus some
> 10% extra slack. If they don't, let us continue the discussion based on the
> access pattern (maybe your http access_log can give some kind of hints).
>
> If need be, it should be possible to add some extra code into io-cache to
> forcefully 'pin' cache pages of, say, '*.jpg' etc.
>
> thanks,
> avati
>
>
> 2007/7/24, Bernhard J. M.
Grün <bernhard.gruen@xxxxxxxxxxxxxx>:
> > Anand, Harris,
> >
> > Thank you for your help. We'll try to migrate to mainline--2.5 this
> > night. I really hope that it helps to speed up our setup. I'll send
> > you the new configuration and also some new throughput benchmarks
> > after some tests with the new version.
> >
> > But first I have some questions about the io-cache feature. Would you
> > suggest using it on the server or on the client side or even on both
> > sides? The servers have 8GB of RAM each and the clients have 4GB of
> > RAM each. So what would you suggest as good values for cache size in
> > our scenario?
> >
> > I really hope the switch from mainline--2.4 to mainline--2.5 works well.
> >
> > Many thanks again for your work!
> >
> > Bernhard J. M. Grün
> >
> > 2007/7/24, Anand Avati <avati@xxxxxxxxxxxxx>:
> > > Bernhard,
> > > Thanks for trying glusterfs! I have some questions/suggestions -
> > >
> > > 1. The read-ahead translator in glusterfs--mainline--2.4 used an
> > > 'always aggressive' mode. Probably setting a lower page-count (2?)
> > > and a page-size of 131072 can help. If you are using gigabit
> > > ethernet, glusterfs can peak 1Gbps even without read-ahead. So you
> > > could in fact try without read-ahead as well.
> > >
> > > 2. I would suggest you try whether the latest TLA on
> > > glusterfs--mainline--2.5 works well for you, and if it does, use the
> > > io-cache translator on the client side. For your scenario (serving a
> > > lot of small files read-only) io-cache should do a lot of good. If
> > > you can have a trial setup and see how well io-cache helps you, we
> > > will be very much interested in knowing your results (and if
> > > possible, some numbers).
> > >
> > > 3. Please try the patched fuse available at -
> > > http://ftp.zresearch.com/pub/gluster/glusterfs/fuse/fuse-2.7.0-glfs1.tar.gz
> > > This patched fuse greatly improves read performance, and we expect
> > > it to complement the io-cache feature very well.
> > >
> > > 4. About using multiple tcp connections, the load-balancer feature
> > > is in our roadmap where you can load balance over two network
> > > interfaces, or just exploit multiple tcp connections over the same
> > > network interface. You will have to wait for the 1.4 release for
> > > this.
> > >
> > > thanks,
> > > avati
> > >
> > > 2007/7/24, Bernhard J. M. Grün <bernhard.gruen@xxxxxxxxxxxxxx>:
> > > >
> > > > Hello!
> > > >
> > > > We experience some performance problems with our setup at the moment.
> > > > And we would be happy if one of you could help us out.
> > > > This is our setup:
> > > > Two clients connect to two servers that share the same data via AFR.
> > > > The two servers hold about 13.000.000 smaller image files that are
> > > > sent out to the web via the two clients.
> > > > First I'll show you the configuration of the servers:
> > > > volume brick
> > > >   type storage/posix                 # POSIX FS translator
> > > >   option directory /media/storage    # Export this directory
> > > > end-volume
> > > >
> > > > volume iothreads    #iothreads can give performance a boost
> > > >   type performance/io-threads
> > > >   option thread-count 16
> > > >   subvolumes brick
> > > > end-volume
> > > >
> > > > ### Add network serving capability to above brick.
> > > > volume server
> > > >   type protocol/server
> > > >   option transport-type tcp/server   # For TCP/IP transport
> > > >   option listen-port 6996            # Default is 6996
> > > >   option client-volume-filename /opt/glusterfs/etc/glusterfs/client.vol
> > > >   subvolumes iothreads
> > > >   option auth.ip.iothreads.allow *   # Allow access to "brick" volume
> > > > end-volume
> > > >
> > > > Now the configuration of the clients:
> > > > ### Add client feature and attach to remote subvolume
> > > > volume client1
> > > >   type protocol/client
> > > >   option transport-type tcp/client   # for TCP/IP transport
> > > >   option remote-host 10.1.1.13       # IP address of the remote brick
> > > >   option remote-port 6996            # default server port is 6996
> > > >   option remote-subvolume iothreads  # name of the remote volume
> > > > end-volume
> > > >
> > > > ### Add client feature and attach to remote subvolume
> > > > volume client2
> > > >   type protocol/client
> > > >   option transport-type tcp/client   # for TCP/IP transport
> > > >   option remote-host 10.1.1.14       # IP address of the remote brick
> > > >   option remote-port 6996            # default server port is 6996
> > > >   option remote-subvolume iothreads  # name of the remote volume
> > > > end-volume
> > > >
> > > > volume afrbricks
> > > >   type cluster/afr
> > > >   subvolumes client1 client2
> > > >   option replicate *:2
> > > > end-volume
> > > >
> > > > volume iothreads    #iothreads can give performance a boost
> > > >   type performance/io-threads
> > > >   option thread-count 8
> > > >   subvolumes afrbricks
> > > > end-volume
> > > >
> > > > ### Add writeback feature
> > > > volume writeback
> > > >   type performance/write-behind
> > > >   option aggregate-size 0            # unit in bytes
> > > >   subvolumes iothreads
> > > > end-volume
> > > >
> > > > ### Add readahead feature
> > > > volume bricks
> > > >   type performance/read-ahead
> > > >   option page-size 65536             # unit in bytes
> > > >   option page-count 16               # cache per file = (page-count x page-size)
> > > >   subvolumes writeback
> > > > end-volume
> > > >
> > > > We use Lighttpd as web server to handle the web traffic and it seems
> > > > that the image loading is quite slow. Also the used bandwidth between
> > > > one client and its corresponding AFR-Server is low - about 12 MBit/s
> > > > over a 1 GBit line. So there must be a bottleneck in our
> > > > configuration. Maybe you can help us.
> > > > At the moment we are using 1.3.0 (mainline--2.4 patch-131). At the
> > > > moment we can't easily switch to mainline--2.5 because the servers are
> > > > under high load.
> > > >
> > > > We also have seen that each client uses only one connection to each
> > > > server. In my opinion this means that the iothreads subvolume on the
> > > > client is (nearly) useless. Wouldn't it be better to establish more
> > > > than just one connection to each server?
> > > >
> > > > Many thanks in advance
> > > >
> > > > Bernhard J. M. Grün
> > > >
> > > >
> > > > _______________________________________________
> > > > Gluster-devel mailing list
> > > > Gluster-devel@xxxxxxxxxx
> > > > http://lists.nongnu.org/mailman/listinfo/gluster-devel
> > > >
> > >
> > >
> > >
> > > --
> > > Anand V. Avati
> >
> >
>
> --
> Anand V. Avati
-- Anand V. Avati
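
For readers following the thread, the pruning rules described in the first message (lowest priority is pruned first, LRU order within a priority, default priority 0) can be sketched roughly as below. This is only an illustrative Python model, not the io-cache code: the names `parse_priority_spec`, `priority_for` and `PriorityCache` are made up for this sketch, it tracks whole files rather than pages, it assumes shell-style wildcard matching, and it assumes the first matching pattern wins, none of which is confirmed in the thread.

```python
from collections import OrderedDict
from fnmatch import fnmatchcase


def parse_priority_spec(spec):
    """Parse 'pattern:prio,pattern:prio' into a list of (pattern, int) pairs."""
    rules = []
    for item in spec.split(","):
        pattern, _, prio = item.strip().rpartition(":")
        rules.append((pattern, int(prio)))
    return rules


def priority_for(path, rules):
    """Priority of the first matching pattern (an assumption), else 0."""
    for pattern, prio in rules:
        if fnmatchcase(path, pattern):
            return prio
    return 0


class PriorityCache:
    """Hypothetical file-granular model of priority-aware LRU pruning."""

    def __init__(self, limit_bytes, rules):
        self.limit = limit_bytes
        self.rules = rules
        self.used = 0
        # priority -> OrderedDict(path -> size), least recently used first
        self.levels = {}

    def access(self, path, size):
        lru = self.levels.setdefault(priority_for(path, self.rules), OrderedDict())
        if path in lru:
            lru.move_to_end(path)  # cache hit: mark as most recently used
            return
        lru[path] = size
        self.used += size
        self._prune()

    def _prune(self):
        # Empty the lowest priority first; move up only when it is exhausted.
        for prio in sorted(self.levels):
            lru = self.levels[prio]
            while lru and self.used > self.limit:
                _, size = lru.popitem(last=False)  # evict least recently used
                self.used -= size
            if self.used <= self.limit:
                return

    def cached(self):
        return sorted(p for lru in self.levels.values() for p in lru)


rules = parse_priority_spec("*/global/*.jpg:3,*/thumbnails/*.jpg:2,*.html:1")
cache = PriorityCache(limit_bytes=10, rules=rules)
for name, size in [("/g/global/a.jpg", 4), ("/u/thumbnails/b.jpg", 4), ("index.html", 2)]:
    cache.access(name, size)
```

In this model a new file of priority 0 that does not fit is itself the first candidate for pruning, which matches the rule that higher-priority content stays in cache longest.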