Chris, what cache-size did you configure in io-cache? Is it possible to
share throughput benchmarks using dd (both read and write)? Also, what is
the iozone performance at a 128 kB reclen?

avati

2007/11/21, Chris Johnson <johnson@xxxxxxxxxxxxxxxxxxx>:
>
>      On Wed, 21 Nov 2007, Chris Johnson wrote:
>
>      OK, caching and write-behind moved to the client side. There is some
> improvement.
>
>                                                     random  random    bkwd  record  stride
>       KB  reclen   write rewrite    read   reread     read   write    read  rewrite   read  fwrite  frewrite  fread  freread
>   131072      32     312     312     361      363     1453     322     677      320    753     312       312    369      363
>
> But as you can see it's marginal. Is this typical, i.e. being an
> order of magnitude slower than NFS?
>
>
> On Wed, 21 Nov 2007, Anand Avati wrote:
>
> >
> >      See, I asked if there was a philosophy about how to build a stack.
> > Never got a response until now.
> >
> >      Caching won't help in the real application, I don't believe.
> > Mostly it's read, crunch, write. If I'm wrong here please let me
> > know. Although I don't believe it will hurt. I'll try moving
> > write-behind and io-cache to the client and see what happens. Does it
> > matter how they're stacked, i.e. which comes first?
> >
> >> You should also be loading io-cache on the client side with a decent
> >> cache-size (like 256MB? depends on how much RAM you have to spare). This
> >> will help re-read improve a lot.
> >>
> >> avati
> >>
> >> 2007/11/21, Anand Avati <avati@xxxxxxxxxxxxx>:
> >>>
> >>> Chris,
> >>> you should really be loading write-behind on the client side; that is what
> >>> improves write performance the most. Do let us know the results with
> >>> write-behind on the client side.
> >>>
> >>> avati
> >>>
> >>> 2007/11/21, Chris Johnson <johnson@xxxxxxxxxxxxxxxxxxx>:
> >>>>
> >>>> Hi, again,
> >>>>
> >>>>      I asked about stack-building philosophy. Apparently there isn't
> >>>> one, so I tried a few things. The configs are down at the end here.
> >>>>
> >>>> Two systems, CentOS 5, both running fuse-devel-2.7.0-1 (gluster
> >>>> enhanced) and glusterfs-1.3.5-2. Both have gigabit ethernet; the server
> >>>> runs a SATABeast. Currently I get the following from iozone:
> >>>>
> >>>> iozone -aN -r 32k -s 131072k -f /mnt/glusterfs/sdm1/junknstuff
> >>>>
> >>>>                                                     random  random    bkwd  record  stride
> >>>>       KB  reclen   write rewrite    read   reread     read   write    read  rewrite   read  fwrite  frewrite  fread  freread
> >>>>   131072      32     589     587     345      343      818     621     757      624    845     592       591    346      366
> >>>>
> >>>> Now, a similar test using NFS on a CentOS 4.4 system running a 3ware
> >>>> RAID card gives this:
> >>>>
> >>>> iozone -aN -r 32k -s 131072k -f /space/sake/5/admin/junknstuff
> >>>>
> >>>>                                                     random  random    bkwd  record  stride
> >>>>       KB  reclen   write rewrite    read   reread     read   write    read  rewrite   read  fwrite  frewrite  fread  freread
> >>>>   131072      32      27      26     292       11       11      24     542        9    539      30        28    295       11
> >>>>
> >>>> And you can see that the NFS system is faster. Is this because of the
> >>>> hardware 3ware RAID, or is NFS really that much faster here? Is there
> >>>> a better way to stack this that would improve things? And I tried with
> >>>> and without striping; no noticeable difference in gluster performance.
> >>>>
> >>>> Help appreciated.
> >>>>
> >>>> ============ server config
> >>>>
> >>>> volume brick1
> >>>>   type storage/posix
> >>>>   option directory /home/sdm1
> >>>> end-volume
> >>>>
> >>>> volume brick2
> >>>>   type storage/posix
> >>>>   option directory /home/sdl1
> >>>> end-volume
> >>>>
> >>>> volume brick3
> >>>>   type storage/posix
> >>>>   option directory /home/sdk1
> >>>> end-volume
> >>>>
> >>>> volume brick4
> >>>>   type storage/posix
> >>>>   option directory /home/sdk1
> >>>> end-volume
> >>>>
> >>>> volume ns-brick
> >>>>   type storage/posix
> >>>>   option directory /home/sdk1
> >>>> end-volume
> >>>>
> >>>> volume stripe1
> >>>>   type cluster/stripe
> >>>>   subvolumes brick1 brick2
> >>>> # option block-size *:10KB,
> >>>> end-volume
> >>>>
> >>>> volume stripe2
> >>>>   type cluster/stripe
> >>>>   subvolumes brick3 brick4
> >>>> # option block-size *:10KB,
> >>>> end-volume
> >>>>
> >>>> volume unify0
> >>>>   type cluster/unify
> >>>>   subvolumes stripe1 stripe2
> >>>>   option namespace ns-brick
> >>>>   option scheduler rr
> >>>> # option rr.limits.min-disk-free 5
> >>>> end-volume
> >>>>
> >>>> volume iot
> >>>>   type performance/io-threads
> >>>>   subvolumes unify0
> >>>>   option thread-count 8
> >>>> end-volume
> >>>>
> >>>> volume writebehind
> >>>>   type performance/write-behind
> >>>>   option aggregate-size 131072   # in bytes
> >>>>   subvolumes iot
> >>>> end-volume
> >>>>
> >>>> volume readahead
> >>>>   type performance/read-ahead
> >>>> # option page-size 65536   ### in bytes
> >>>>   option page-size 128kb   ### in bytes
> >>>> # option page-count 16     ### memory cache size is page-count x page-size per file
> >>>>   option page-count 2      ### memory cache size is page-count x page-size per file
> >>>>   subvolumes writebehind
> >>>> end-volume
> >>>>
> >>>> volume server
> >>>>   type protocol/server
> >>>>   subvolumes readahead
> >>>>   option transport-type tcp/server   # For TCP/IP transport
> >>>> # option client-volume-filename /etc/glusterfs/glusterfs-client.vol
> >>>>   option auth.ip.readahead.allow *
> >>>> end-volume
> >>>>
> >>>>
> >>>> ============ client config
> >>>>
> >>>> volume client
> >>>>   type protocol/client
> >>>>   option transport-type tcp/client
> >>>>   option remote-host xxx.xxx.xxx.xxx
> >>>>   option remote-subvolume readahead
> >>>> end-volume
> >>>>
> >>>>
> >>>> -------------------------------------------------------------------------------
> >>>> Chris Johnson               |Internet: johnson@xxxxxxxxxxxxxxxxxxx
> >>>> Systems Administrator       |Web:      http://www.nmr.mgh.harvard.edu/~johnson
> >>>> NMR Center                  |Voice:    617.726.0949
> >>>> Mass. General Hospital      |FAX:      617.726.7422
> >>>> 149 (2301) 13th Street      |A compromise is a solution nobody is happy with.
> >>>> Charlestown, MA., 02129 USA |          Observation, Unknown
> >>>> -------------------------------------------------------------------------------
> >>>>
> >>>> _______________________________________________
> >>>> Gluster-devel mailing list
> >>>> Gluster-devel@xxxxxxxxxx
> >>>> http://lists.nongnu.org/mailman/listinfo/gluster-devel
> >>>>
> >>>
> >>> --
> >>> It always takes longer than you expect, even when you take into account
> >>> Hofstadter's Law.
> >>>
> >>>      -- Hofstadter's Law
> >>
> >>
> >> --
> >> It always takes longer than you expect, even when you take into account
> >> Hofstadter's Law.
> >>
> >>      -- Hofstadter's Law
> >>
> >
> > -------------------------------------------------------------------------------
> > Chris Johnson               |Internet: johnson@xxxxxxxxxxxxxxxxxxx
> > Systems Administrator       |Web:      http://www.nmr.mgh.harvard.edu/~johnson
> > NMR Center                  |Voice:    617.726.0949
> > Mass. General Hospital      |FAX:      617.726.7422
> > 149 (2301) 13th Street      |For all sad words of tongue or pen, the saddest
> > Charlestown, MA., 02129 USA |are these: "It might have been".  John G. Whittier
> > -------------------------------------------------------------------------------
> >
>
> -------------------------------------------------------------------------------
> Chris Johnson               |Internet: johnson@xxxxxxxxxxxxxxxxxxx
> Systems Administrator       |Web:      http://www.nmr.mgh.harvard.edu/~johnson
> NMR Center                  |Voice:    617.726.0949
> Mass. General Hospital      |FAX:      617.726.7422
> 149 (2301) 13th Street      |Fifty percent of all doctors graduated in the
> Charlestown, MA., 02129 USA |lower half of the class.  Observation
> -------------------------------------------------------------------------------

--
It always takes longer than you expect, even when you take into account
Hofstadter's Law.

     -- Hofstadter's Law
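For reference, the client-side stack being suggested in the thread above (write-behind
loaded over protocol/client, io-cache on top with a generous cache-size) would look
roughly like the sketch below. This is a minimal sketch, not a tested configuration:
the option names follow the glusterfs 1.3 translator documentation, the remote-host is
the same placeholder used in Chris's client config, and the 256MB / 128KB values are
only the figures floated in this thread.

    # client.vol -- illustrative sketch only
    volume client
      type protocol/client
      option transport-type tcp/client
      option remote-host xxx.xxx.xxx.xxx    # placeholder, as in the config above
      option remote-subvolume readahead     # the volume exported by the server config
    end-volume

    volume writebehind
      type performance/write-behind
      option aggregate-size 131072          # bytes, same value used server-side above
      subvolumes client
    end-volume

    volume iocache
      type performance/io-cache
      option cache-size 256MB               # "like 256MB? depends on how much RAM you have to spare"
      option page-size 128KB                # assumed value; tune to the workload
      subvolumes writebehind
    end-volume

The dd and iozone numbers Avati asks for could be gathered along these lines. The test
file name is illustrative, the write should be larger than the io-cache so caching does
not mask the result, and note that iozone's -N flag reports microseconds per operation,
so lower numbers are faster:

    # sequential write, then read back, through the glusterfs mount (~1 GB file)
    dd if=/dev/zero of=/mnt/glusterfs/ddtest bs=128k count=8192 conv=fsync
    dd if=/mnt/glusterfs/ddtest of=/dev/null bs=128k

    # same iozone run as before, but with a 128 kB record length
    iozone -aN -r 128k -s 131072k -f /mnt/glusterfs/sdm1/junknstuff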