Anand,

the number of files is - at the moment - 13,000,000 image files. The size of
our cluster is 2x10TB (via AFR, so 10TB usable). This means that we don't have
enough RAM to hold all of it. But we could use at most 2 GB of RAM on each
server for caching purposes.

Our small files consist of three different types: global galleries (like event
photos), user galleries and group galleries. The global galleries are, in my
opinion, the only files worth caching, and for these global galleries only the
newest files are interesting. On the other hand there are two image sizes:
thumbnails and full-size images. It would be best to cache only the small
thumbnails because these won't clutter the cache too much (size: about 4-5 KB
each). At the moment we are using a Squid cache in front of our web server to
do exactly this job, but Squid is not optimized for that many connections.
Therefore io-cache should be exactly the feature we want.

Thanks for your great help!

Bernhard
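To make this concrete, what I have in mind on the client side would roughly
look like the volume below, stacked on top of our existing "bricks"
(read-ahead) volume. This is only an untested sketch - I am assuming the option
names and unit syntax from the io-cache description, so please correct me if
they are off:

volume iocache
  type performance/io-cache
  option cache-size 2048MB   # the ~2 GB per machine we can spare for caching (assumed syntax)
  option page-size 128KB     # one page easily holds a whole 4-5 KB thumbnail (assumed syntax)
  subvolumes bricks
end-volume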
2007/7/24, Anand Avati <avati@xxxxxxxxxxxxx>:

Bernhard,

io-cache on the client side gives the best performance. The newer versions
will have advantages in loading io-cache on the server as well, but for now it
is designed for the client side.

As for the optimum size for io-cache: what is the total size of all the
combined images being served? Would they all fit in your RAM? What is the
access pattern of these small files? How many bytes (of other files) are
generally accessed before a file is re-used? io-cache currently uses an LRU
algorithm to age out old pages and files. If they all fit in your RAM, then
give a cache-size of that size plus some 10% extra slack. If they don't, let
us continue the discussion based on the access pattern (maybe your http
access_log can give some kind of hints). If need be, it should be possible to
add some extra code into io-cache to forcefully 'pin' cache pages of, say,
'*.jpg' etc.
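Such a pin could, for example, be exposed as a per-pattern weight in the
client spec. The option below is purely hypothetical - nothing like it exists
in io-cache today - and is only meant to illustrate the idea:

volume iocache
  type performance/io-cache
  option cache-size 2GB        # working set plus ~10% slack
  option priority *.jpg:3,*:1  # hypothetical knob: keep jpg pages longer before LRU evicts them
  subvolumes bricks
end-volume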
thanks,
avati

2007/7/24, Bernhard J. M. Grün <bernhard.gruen@xxxxxxxxxxxxxx>:
> Anand, Harris,
>
> Thank you for your help. We'll try to migrate to mainline--2.5 tonight.
> I really hope that it helps to speed up our setup. I'll send you the new
> configuration and also some new throughput benchmarks after some tests
> with the new version.
>
> But first I have some questions about the io-cache feature. Would you
> suggest using it on the server side, on the client side, or even on both
> sides? The servers have 8GB of RAM each and the clients have 4GB of RAM
> each. So what would you suggest as good values for the cache size in our
> scenario?
>
> I really hope the switch from mainline--2.4 to mainline--2.5 works well.
>
> Many thanks again for your work!
>
> Bernhard J. M. Grün
>
> 2007/7/24, Anand Avati <avati@xxxxxxxxxxxxx>:
> > Bernhard,
> > Thanks for trying glusterfs! I have some questions/suggestions -
> >
> > 1. The read-ahead translator in glusterfs--mainline--2.4 used an 'always
> > aggressive' mode. Probably setting a lower page-count (2?) and a page-size
> > of 131072 can help. If you are using gigabit ethernet, glusterfs can peak
> > at 1Gbps even without read-ahead. So you could in fact try without
> > read-ahead as well.
> >
> > 2. I would suggest you try whether the latest TLA on glusterfs--mainline--2.5
> > works well for you, and if it does, use the io-cache translator on the
> > client side. For your scenario (serving a lot of small files read-only)
> > io-cache should do a lot of good. If you can have a trial setup and see how
> > well io-cache helps you, we will be very much interested in knowing your
> > results (and if possible, some numbers).
> >
> > 3. Please try the patched fuse available at
> > http://ftp.zresearch.com/pub/gluster/glusterfs/fuse/fuse-2.7.0-glfs1.tar.gz
> > This patched fuse greatly improves read performance, and we expect it to
> > complement the io-cache feature very well.
> >
> > 4. About using multiple tcp connections: the load-balancer feature is on our
> > roadmap, where you can load balance over two network interfaces, or just
> > exploit multiple tcp connections over the same network interface. You will
> > have to wait for the 1.4 release for this.
> >
> > thanks,
> > avati
> >
> > 2007/7/24, Bernhard J. M. Grün <bernhard.gruen@xxxxxxxxxxxxxx>:
> > >
> > > Hello!
> > >
> > > We are experiencing some performance problems with our setup at the
> > > moment, and we would be happy if one of you could help us out.
> > > This is our setup:
> > > Two clients connect to two servers that share the same data via AFR.
> > > The two servers hold about 13,000,000 smaller image files that are
> > > sent out to the web via the two clients.
> > >
> > > First I'll show you the configuration of the servers:
> > >
> > > volume brick
> > >   type storage/posix                  # POSIX FS translator
> > >   option directory /media/storage     # Export this directory
> > > end-volume
> > >
> > > volume iothreads   # iothreads can give performance a boost
> > >   type performance/io-threads
> > >   option thread-count 16
> > >   subvolumes brick
> > > end-volume
> > >
> > > ### Add network serving capability to above brick.
> > > volume server
> > >   type protocol/server
> > >   option transport-type tcp/server    # For TCP/IP transport
> > >   option listen-port 6996             # Default is 6996
> > >   option client-volume-filename /opt/glusterfs/etc/glusterfs/client.vol
> > >   subvolumes iothreads
> > >   option auth.ip.iothreads.allow *    # Allow access to the "iothreads" volume
> > > end-volume
> > >
> > > Now the configuration of the clients:
> > >
> > > ### Add client feature and attach to remote subvolume
> > > volume client1
> > >   type protocol/client
> > >   option transport-type tcp/client    # for TCP/IP transport
> > >   option remote-host 10.1.1.13        # IP address of the remote brick
> > >   option remote-port 6996             # default server port is 6996
> > >   option remote-subvolume iothreads   # name of the remote volume
> > > end-volume
> > >
> > > ### Add client feature and attach to remote subvolume
> > > volume client2
> > >   type protocol/client
> > >   option transport-type tcp/client    # for TCP/IP transport
> > >   option remote-host 10.1.1.14        # IP address of the remote brick
> > >   option remote-port 6996             # default server port is 6996
> > >   option remote-subvolume iothreads   # name of the remote volume
> > > end-volume
> > >
> > > volume afrbricks
> > >   type cluster/afr
> > >   subvolumes client1 client2
> > >   option replicate *:2
> > > end-volume
> > >
> > > volume iothreads   # iothreads can give performance a boost
> > >   type performance/io-threads
> > >   option thread-count 8
> > >   subvolumes afrbricks
> > > end-volume
> > >
> > > ### Add writeback feature
> > > volume writeback
> > >   type performance/write-behind
> > >   option aggregate-size 0             # unit in bytes
> > >   subvolumes iothreads
> > > end-volume
> > >
> > > ### Add readahead feature
> > > volume bricks
> > >   type performance/read-ahead
> > >   option page-size 65536              # unit in bytes
> > >   option page-count 16                # cache per file = (page-count x page-size)
> > >   subvolumes writeback
> > > end-volume
> > >
> > > We use Lighttpd as the web server to handle the web traffic, and it
> > > seems that the image loading is quite slow. Also the bandwidth used
> > > between one client and its corresponding AFR server is low - about
> > > 12 MBit/s over a 1 GBit line. So there must be a bottleneck in our
> > > configuration. Maybe you can help us.
> > > At the moment we are using 1.3.0 (mainline--2.4 patch-131). We can't
> > > easily switch to mainline--2.5 right now because the servers are under
> > > high load.
> > >
> > > We have also seen that each client uses only one connection to each
> > > server. In my opinion this means that the iothreads subvolume on the
> > > client is (nearly) useless. Wouldn't it be better to establish more
> > > than just one connection to each server?
> > >
> > > Many thanks in advance
> > >
> > > Bernhard J. M. Grün
> > >
> > > _______________________________________________
> > > Gluster-devel mailing list
> > > Gluster-devel@xxxxxxxxxx
> > > http://lists.nongnu.org/mailman/listinfo/gluster-devel
> > >
> >
> > --
> > Anand V. Avati
>
> --
> Anand V. Avati