Here are some more details with different configs:

* Only AFR between cfs1 & cfs2:

root at dev1# time cp -rp * /mnt/

real    16m45.995s
user    0m1.104s
sys     0m5.528s

* Single server - cfs1:

root at dev1# time cp -rp * /mnt/

real    10m33.967s
user    0m0.764s
sys     0m5.516s

* Stats via bmon on cfs1 during the above copy (graph output omitted):

  # Interface            RX Rate     RX #    TX Rate     TX #
  cfs1 (source: local)
  0 eth1                 951.25KiB   1892    254.00KiB   1633

It gets progressively better, but that's still a *long* way from the
<2 min times with scp & the <1 min times with rsync! And in that last
test I have no redundancy or distributed hash at all.

* Client config for the last test:

-----
# Webform Flat-File Cache Volume client configuration

volume srv1
  type protocol/client
  option transport-type tcp
  option remote-host cfs1
  option remote-subvolume webform_cache_brick
end-volume

volume writebehind
  type performance/write-behind
  option cache-size 4mb
  option flush-behind on
  subvolumes srv1
end-volume

volume cache
  type performance/io-cache
  option cache-size 512mb
  subvolumes writebehind
end-volume
-----
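For reference, here is roughly how the whole comparison is run end to
end. Hostnames and paths are from my setup; the volfile path and the
glusterfs mount line are only illustrative and may need adjusting for
your install. (As I understand it, flush-behind in the write-behind
translator should be the biggest help here, since it returns from the
flush on close() before the data is fully pushed to the server, which
is exactly the small-file pattern.)

-----
#!/bin/sh
# Rough sketch of the timing comparison, not a polished benchmark.
# Assumes the small-file tree lives in /home/aweber/cache on the client
# and the client config above is saved as /etc/glusterfs/webform_cache.vol
# (that filename is made up; use whatever you actually named it).

cd /home/aweber/cache

# Baseline 1: scp straight to one of the servers
time scp -rp * benk@cfs1:~/cache/

# Baseline 2: rsync to the same place
time rsync -ap * benk@cfs1:~/cache/

# GlusterFS: mount the client volfile, then copy into the mountpoint
glusterfs -f /etc/glusterfs/webform_cache.vol /mnt
time cp -rp * /mnt/
-----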
Ben

On Jun 3, 2009, at 4:33 PM, Vahriç Muhtaryan wrote:

> To better understand the issue, did you try 4 servers with DHT only,
> 2 servers with DHT only, or two servers with replication only, to
> find out where the real problem is? Maybe replication or DHT could
> have a bug?
>
> -----Original Message-----
> From: gluster-users-bounces at gluster.org
> [mailto:gluster-users-bounces at gluster.org] On Behalf Of Benjamin Krein
> Sent: Wednesday, June 03, 2009 11:00 PM
> To: Jasper van Wanrooy - Chatventure
> Cc: gluster-users at gluster.org
> Subject: Re: Horrible performance with small files (DHT/AFR)
>
> The current boxes I'm using for testing are as follows:
>
> * 2x dual-core Opteron ~2GHz (x86_64)
> * 4GB RAM
> * 4x 7200 RPM 73GB SATA - RAID1+0 w/3ware hardware controllers
>
> The server storage directories live in /home/clusterfs, where /home is
> an ext3 partition mounted with noatime.
>
> These servers are not virtualized. They are running Ubuntu 8.04 LTS
> Server x86_64.
>
> The files I'm copying are all <2k javascript files (plain text) stored
> in 100 hash directories in each of 3 parent directories:
>
> /home/clusterfs/
> + parentdir1/
> | + 00/
> | | ...
> | + 99/
> + parentdir2/
> | + 00/
> | | ...
> | + 99/
> + parentdir3/
>   + 00/
>   | ...
>   + 99/
>
> There are ~10k of these <2k javascript files distributed throughout
> the above directory structure, totaling approximately 570MB. My tests
> have been copying that entire directory structure from a client
> machine into the glusterfs mountpoint on the client.
>
> Observing IO on both the client box & all the server boxes via iostat
> shows that the disks are doing *very* little work. Observing the CPU/
> memory load with top or htop shows that none of the boxes are CPU or
> memory bound. Observing the bandwidth in/out of the network interface
> shows <1MB/s throughput (we have a fully gigabit LAN!), which usually
> drops down to <150KB/s during the copy.
>
> scp'ing the same directory structure from the same client to one of
> the same servers runs at ~40-50MB/s sustained, as a comparison.
>
> Here are the results of copying the same directory structure using
> rsync to the same partition:
>
> # time rsync -ap * benk at cfs1:~/cache/
> benk at cfs1's password:
>
> real    0m23.566s
> user    0m8.433s
> sys     0m4.580s
>
> Ben
>
> On Jun 3, 2009, at 3:16 PM, Jasper van Wanrooy - Chatventure wrote:
>
>> Hi Benjamin,
>>
>> That's not good news. What kind of hardware do you use? Is it
>> virtualised? Or do you use real boxes?
>> What kind of files are you copying in your test? What performance do
>> you have when copying them to a local dir?
>>
>> Best regards, Jasper
>>
>> ----- Original Message -----
>> From: "Benjamin Krein" <superbenk at superk.org>
>> To: "Jasper van Wanrooy - Chatventure" <jvanwanrooy at chatventure.nl>
>> Cc: "Vijay Bellur" <vijay at gluster.com>, gluster-users at gluster.org
>> Sent: Wednesday, 3 June, 2009 19:23:51 GMT +01:00 Amsterdam / Berlin / Bern / Rome / Stockholm / Vienna
>> Subject: Re: Horrible performance with small files (DHT/AFR)
>>
>> I reduced my config to only 2 servers (had to donate 2 of the 4 to
>> another project). I now have a single server using DHT (for future
>> scaling) and AFR to a mirrored server. Copy times are much better,
>> but still pretty horrible:
>>
>> # time cp -rp * /mnt/
>>
>> real    21m11.505s
>> user    0m1.000s
>> sys     0m6.416s
>>
>> Ben
>>
>> On Jun 3, 2009, at 3:13 AM, Jasper van Wanrooy - Chatventure wrote:
>>
>>> Hi Benjamin,
>>>
>>> Did you also try with a lower thread-count? I'm actually using 3
>>> threads.
>>>
>>> Best Regards, Jasper
>>>
>>>
>>> On 2 jun 2009, at 18:25, Benjamin Krein wrote:
>>>
>>>> I do not see any difference with autoscaling removed. Current
>>>> server config:
>>>>
>>>> # webform flat-file cache
>>>>
>>>> volume webform_cache
>>>>   type storage/posix
>>>>   option directory /home/clusterfs/webform/cache
>>>> end-volume
>>>>
>>>> volume webform_cache_locks
>>>>   type features/locks
>>>>   subvolumes webform_cache
>>>> end-volume
>>>>
>>>> volume webform_cache_brick
>>>>   type performance/io-threads
>>>>   option thread-count 32
>>>>   subvolumes webform_cache_locks
>>>> end-volume
>>>>
>>>> <<snip>>
>>>>
>>>> # GlusterFS Server
>>>> volume server
>>>>   type protocol/server
>>>>   option transport-type tcp
>>>>   subvolumes dns_public_brick dns_private_brick webform_usage_brick webform_cache_brick wordpress_uploads_brick subs_exports_brick
>>>>   option auth.addr.dns_public_brick.allow 10.1.1.*
>>>>   option auth.addr.dns_private_brick.allow 10.1.1.*
>>>>   option auth.addr.webform_usage_brick.allow 10.1.1.*
>>>>   option auth.addr.webform_cache_brick.allow 10.1.1.*
>>>>   option auth.addr.wordpress_uploads_brick.allow 10.1.1.*
>>>>   option auth.addr.subs_exports_brick.allow 10.1.1.*
>>>> end-volume
>>>>
>>>> # time cp -rp * /mnt/
>>>>
>>>> real    70m13.672s
>>>> user    0m1.168s
>>>> sys     0m8.377s
>>>>
>>>> NOTE: the above test was also done during peak hours, when the LAN/
>>>> dev server were in use, which would account for some of the extra
>>>> time. This is still WAY too much, though.
>>>>
>>>> Ben
>>>>
>>>>
>>>> On Jun 1, 2009, at 1:40 PM, Vijay Bellur wrote:
>>>>
>>>>> Hi Benjamin,
>>>>>
>>>>> Could you please try turning autoscaling off?
>>>>>
>>>>> Thanks,
>>>>> Vijay
>>>>>
>>>>> Benjamin Krein wrote:
>>>>>> I'm seeing extremely poor performance writing small files to a
>>>>>> glusterfs DHT/AFR mount point.
>>>>>> Here are the stats I'm seeing:
>>>>>>
>>>>>> * Number of files:
>>>>>> root at dev1|/home/aweber/cache|# find | wc -l
>>>>>> 102440
>>>>>>
>>>>>> * Average file size (bytes):
>>>>>> root at dev1|/home/aweber/cache|# ls -lR | awk '{sum += $5; n++;} END {print sum/n;}'
>>>>>> 4776.47
>>>>>>
>>>>>> * Using scp:
>>>>>> root at dev1|/home/aweber/cache|# time scp -rp * benk at cfs1:~/cache/
>>>>>>
>>>>>> real    1m38.726s
>>>>>> user    0m12.173s
>>>>>> sys     0m12.141s
>>>>>>
>>>>>> * Using cp to the glusterfs mount point:
>>>>>> root at dev1|/home/aweber/cache|# time cp -rp * /mnt
>>>>>>
>>>>>> real    30m59.101s
>>>>>> user    0m1.296s
>>>>>> sys     0m5.820s
>>>>>>
>>>>>> Here is my configuration (currently a single client writing to 4
>>>>>> servers: two AFR pairs distributed with DHT):
>>>>>>
>>>>>> SERVER:
>>>>>>
>>>>>> # webform flat-file cache
>>>>>>
>>>>>> volume webform_cache
>>>>>>   type storage/posix
>>>>>>   option directory /home/clusterfs/webform/cache
>>>>>> end-volume
>>>>>>
>>>>>> volume webform_cache_locks
>>>>>>   type features/locks
>>>>>>   subvolumes webform_cache
>>>>>> end-volume
>>>>>>
>>>>>> volume webform_cache_brick
>>>>>>   type performance/io-threads
>>>>>>   option thread-count 32
>>>>>>   option max-threads 128
>>>>>>   option autoscaling on
>>>>>>   subvolumes webform_cache_locks
>>>>>> end-volume
>>>>>>
>>>>>> <<snip>>
>>>>>>
>>>>>> # GlusterFS Server
>>>>>> volume server
>>>>>>   type protocol/server
>>>>>>   option transport-type tcp
>>>>>>   subvolumes dns_public_brick dns_private_brick webform_usage_brick webform_cache_brick wordpress_uploads_brick subs_exports_brick
>>>>>>   option auth.addr.dns_public_brick.allow 10.1.1.*
>>>>>>   option auth.addr.dns_private_brick.allow 10.1.1.*
>>>>>>   option auth.addr.webform_usage_brick.allow 10.1.1.*
>>>>>>   option auth.addr.webform_cache_brick.allow 10.1.1.*
>>>>>>   option auth.addr.wordpress_uploads_brick.allow 10.1.1.*
>>>>>>   option auth.addr.subs_exports_brick.allow 10.1.1.*
>>>>>> end-volume
>>>>>>
>>>>>> CLIENT:
>>>>>>
>>>>>> # Webform Flat-File Cache Volume client configuration
>>>>>>
>>>>>> volume srv1
>>>>>>   type protocol/client
>>>>>>   option transport-type tcp
>>>>>>   option remote-host cfs1
>>>>>>   option remote-subvolume webform_cache_brick
>>>>>> end-volume
>>>>>>
>>>>>> volume srv2
>>>>>>   type protocol/client
>>>>>>   option transport-type tcp
>>>>>>   option remote-host cfs2
>>>>>>   option remote-subvolume webform_cache_brick
>>>>>> end-volume
>>>>>>
>>>>>> volume srv3
>>>>>>   type protocol/client
>>>>>>   option transport-type tcp
>>>>>>   option remote-host cfs3
>>>>>>   option remote-subvolume webform_cache_brick
>>>>>> end-volume
>>>>>>
>>>>>> volume srv4
>>>>>>   type protocol/client
>>>>>>   option transport-type tcp
>>>>>>   option remote-host cfs4
>>>>>>   option remote-subvolume webform_cache_brick
>>>>>> end-volume
>>>>>>
>>>>>> volume afr1
>>>>>>   type cluster/afr
>>>>>>   subvolumes srv1 srv3
>>>>>> end-volume
>>>>>>
>>>>>> volume afr2
>>>>>>   type cluster/afr
>>>>>>   subvolumes srv2 srv4
>>>>>> end-volume
>>>>>>
>>>>>> volume dist
>>>>>>   type cluster/distribute
>>>>>>   subvolumes afr1 afr2
>>>>>> end-volume
>>>>>>
>>>>>> volume writebehind
>>>>>>   type performance/write-behind
>>>>>>   option cache-size 4mb
>>>>>>   option flush-behind on
>>>>>>   subvolumes dist
>>>>>> end-volume
>>>>>>
>>>>>> volume cache
>>>>>>   type performance/io-cache
>>>>>>   option cache-size 512mb
>>>>>>   subvolumes writebehind
>>>>>> end-volume
>>>>>>
>>>>>> Benjamin Krein
>>>>>> www.superk.org
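P.S. For anyone who wants to try to reproduce this against their own
hardware, the sketch below generates a corpus with roughly the same
shape as the one described in this thread (3 parent dirs, 100 hash
dirs each, ~100k files averaging a few KB). The layout and counts come
from the numbers above; the script itself is only illustrative.

-----
#!/bin/sh
# Illustrative generator for a test corpus shaped like the one above:
# 3 parent dirs x 100 hash dirs (00-99), ~100k files of ~5KB each
# (close to the "find | wc -l" count of 102440 and the 4776-byte average).
BASE=/home/aweber/cache     # source tree on the client (adjust to taste)
FILES_PER_DIR=340           # 3 * 100 * 340 is roughly 102k files

for parent in parentdir1 parentdir2 parentdir3; do
    for hash in $(seq -w 0 99); do
        dir="$BASE/$parent/$hash"
        mkdir -p "$dir"
        for i in $(seq 1 $FILES_PER_DIR); do
            # ~5KB of filler data per "javascript" file
            head -c 5120 /dev/urandom > "$dir/file_$i.js"
        done
    done
done
-----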
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://zresearch.com/cgi-bin/mailman/listinfo/gluster-users