> I'll try and find out.
>
> Also, is it the case that glusterfs will always be noticeably
> slower than NFS?

For metadata operations NFS can be faster. But for file I/O our tests show
that most of the time GlusterFS is faster, especially for block sizes > 64K.
You should try with io-cache configured on the client with enough cache-size
to fit your dataset in RAM with some buffer (say 256MB in your case). A sketch
of such a client volfile, and the dd commands I would use to measure
throughput, are appended at the end of this mail.

thanks,
avati

> Chris,
>   what cache-size did you configure in io-cache? Is it possible to share
> throughput benchmarks using dd (both read and write)? Also, what is the
> iozone performance at 128kB reclen?
>
> avati
>
> 2007/11/21, Chris Johnson <johnson@xxxxxxxxxxxxxxxxxxx>:
>>
>> On Wed, 21 Nov 2007, Chris Johnson wrote:
>>
>> Ok, caching and write-behind moved to the client side. There is some
>> improvement.
>>
>>                                                   random  random    bkwd  record  stride
>>         KB  reclen   write rewrite    read  reread    read   write    read rewrite    read  fwrite frewrite   fread freread
>>     131072      32     312     312     361     363    1453     322     677     320     753     312      312     369     363
>>
>> but as you can see it's marginal. Is this typical, i.e. being an
>> order of magnitude slower than NFS?
>>
>>> On Wed, 21 Nov 2007, Anand Avati wrote:
>>>
>>> See, I asked if there was a philosophy about how to build a stack.
>>> Never got a response until now.
>>>
>>> Caching won't help in the real application, I don't believe.
>>> Mostly it's read, crunch, write. If I'm wrong here please let me
>>> know. Although I don't believe it will hurt. I'll try moving
>>> write-behind and io-cache to the client and see what happens. Does it
>>> matter how they're stacked, i.e. which comes first?
>>>
>>>> You should also be loading io-cache on the client side with a decent
>>>> cache-size (like 256MB? depends on how much RAM you have to spare). This
>>>> will help re-read improve a lot.
>>>>
>>>> avati
>>>>
>>>> 2007/11/21, Anand Avati <avati@xxxxxxxxxxxxx>:
>>>>>
>>>>> Chris,
>>>>>   you should really be loading write-behind on the client side; that is
>>>>> what improves write performance the most. Do let us know the results
>>>>> with write-behind on the client side.
>>>>>
>>>>> avati
>>>>>
>>>>> 2007/11/21, Chris Johnson <johnson@xxxxxxxxxxxxxxxxxxx>:
>>>>>>
>>>>>> Hi, again,
>>>>>>
>>>>>>      I asked about stack-building philosophy. Apparently there isn't
>>>>>> one. So I tried a few things. The configs are down at the end here.
>>>>>>
>>>>>>      Two systems, CentOS 5, both running fuse-devel-2.7.0-1 (gluster
>>>>>> enhanced) and glusterfs-1.3.5-2. Both have gigabit ethernet; the server
>>>>>> runs a SATABeast. Currently I get the following from iozone.
>>>>>> iozone -aN -r 32k -s 131072k -f /mnt/glusterfs/sdm1/junknstuff
>>>>>>
>>>>>>                                                   random  random    bkwd  record  stride
>>>>>>         KB  reclen   write rewrite    read  reread    read   write    read rewrite    read  fwrite frewrite   fread freread
>>>>>>     131072      32     589     587     345     343     818     621     757     624     845     592      591     346     366
>>>>>>
>>>>>> Now, a similar test using NFS on a CentOS 4.4 system running a 3ware
>>>>>> RAID card gives this:
>>>>>>
>>>>>> iozone -aN -r 32k -s 131072k -f /space/sake/5/admin/junknstuff
>>>>>>
>>>>>>                                                   random  random    bkwd  record  stride
>>>>>>         KB  reclen   write rewrite    read  reread    read   write    read rewrite    read  fwrite frewrite   fread freread
>>>>>>     131072      32      27      26     292      11      11      24     542       9     539      30       28     295      11
>>>>>>
>>>>>> And you can see that the NFS system is faster. Is this because of the
>>>>>> hardware 3ware RAID, or is NFS really that much faster here? Is there
>>>>>> a better way to stack this that would improve things? I tried with
>>>>>> and without striping; no noticeable difference in GlusterFS performance.
>>>>>>
>>>>>> Help appreciated.
>>>>>>
>>>>>> ============ server config
>>>>>>
>>>>>> volume brick1
>>>>>>   type storage/posix
>>>>>>   option directory /home/sdm1
>>>>>> end-volume
>>>>>>
>>>>>> volume brick2
>>>>>>   type storage/posix
>>>>>>   option directory /home/sdl1
>>>>>> end-volume
>>>>>>
>>>>>> volume brick3
>>>>>>   type storage/posix
>>>>>>   option directory /home/sdk1
>>>>>> end-volume
>>>>>>
>>>>>> volume brick4
>>>>>>   type storage/posix
>>>>>>   option directory /home/sdk1
>>>>>> end-volume
>>>>>>
>>>>>> volume ns-brick
>>>>>>   type storage/posix
>>>>>>   option directory /home/sdk1
>>>>>> end-volume
>>>>>>
>>>>>> volume stripe1
>>>>>>   type cluster/stripe
>>>>>>   subvolumes brick1 brick2
>>>>>> # option block-size *:10KB,
>>>>>> end-volume
>>>>>>
>>>>>> volume stripe2
>>>>>>   type cluster/stripe
>>>>>>   subvolumes brick3 brick4
>>>>>> # option block-size *:10KB,
>>>>>> end-volume
>>>>>>
>>>>>> volume unify0
>>>>>>   type cluster/unify
>>>>>>   subvolumes stripe1 stripe2
>>>>>>   option namespace ns-brick
>>>>>>   option scheduler rr
>>>>>> # option rr.limits.min-disk-free 5
>>>>>> end-volume
>>>>>>
>>>>>> volume iot
>>>>>>   type performance/io-threads
>>>>>>   subvolumes unify0
>>>>>>   option thread-count 8
>>>>>> end-volume
>>>>>>
>>>>>> volume writebehind
>>>>>>   type performance/write-behind
>>>>>>   option aggregate-size 131072   # in bytes
>>>>>>   subvolumes iot
>>>>>> end-volume
>>>>>>
>>>>>> volume readahead
>>>>>>   type performance/read-ahead
>>>>>> # option page-size 65536    ### in bytes
>>>>>>   option page-size 128kb    ### in bytes
>>>>>> # option page-count 16      ### memory cache size is page-count x page-size per file
>>>>>>   option page-count 2       ### memory cache size is page-count x page-size per file
>>>>>>   subvolumes writebehind
>>>>>> end-volume
>>>>>>
>>>>>> volume server
>>>>>>   type protocol/server
>>>>>>   subvolumes readahead
>>>>>>   option transport-type tcp/server   # For TCP/IP transport
>>>>>> # option client-volume-filename /etc/glusterfs/glusterfs-client.vol
>>>>>>   option auth.ip.readahead.allow *
>>>>>> end-volume
>>>>>>
>>>>>> ============ client config
>>>>>>
>>>>>> volume client
>>>>>>   type protocol/client
>>>>>>   option transport-type tcp/client
>>>>>>   option remote-host xxx.xxx.xxx.xxx
>>>>>>   option remote-subvolume readahead
>>>>>> end-volume
>>>>>>
>>>>>> -------------------------------------------------------------------------------
>>>>>> Chris Johnson               |Internet: johnson@xxxxxxxxxxxxxxxxxxx
>>>>>> Systems Administrator       |Web: http://www.nmr.mgh.harvard.edu/~johnson
>>>>>> NMR Center                  |Voice: 617.726.0949
>>>>>> Mass. General Hospital      |FAX: 617.726.7422
>>>>>> 149 (2301) 13th Street      |A compromise is a solution nobody is happy with.
>>>>>> Charlestown, MA., 02129 USA |   Observation, Unknown
>>>>>> -------------------------------------------------------------------------------

--
It always takes longer than you expect, even when you take into account
Hofstadter's Law.

  -- Hofstadter's Law
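
Here is a rough sketch of the kind of client volfile I mean, not a drop-in
config: it assumes the server still exports the "readahead" volume as in your
posted server config, and the volume names, the 256MB cache-size and the 128KB
aggregate-size are only illustrative values to adapt to your RAM and workload.

============ example client config (sketch)

volume client
  type protocol/client
  option transport-type tcp/client
  option remote-host xxx.xxx.xxx.xxx     # your server's address
  option remote-subvolume readahead      # the volume exported by protocol/server
end-volume

volume writebehind
  type performance/write-behind
  option aggregate-size 131072           # in bytes, same value you use on the server
  subvolumes client
end-volume

volume iocache
  type performance/io-cache
  option cache-size 256MB                # large enough to hold the 128MB iozone dataset
  subvolumes writebehind
end-volume

On Chris's stacking question: the mount uses the topmost volume (iocache here),
and each translator hands requests down to its subvolume, so with this ordering
re-reads can be answered from io-cache on the client and writes are buffered by
write-behind before they go over the network to the server.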
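
For the dd numbers: since iozone was run with -N, the figures above are
microseconds per operation (lower is better), while dd gives an absolute
throughput figure that is easier to compare between the two mounts. A rough
sketch, assuming the GlusterFS volume is mounted at /mnt/glusterfs and you can
drop the client's page cache as root; the file name and sizes are only examples:

# write: stream a 1GB file onto the mount in 128KB blocks
dd if=/dev/zero of=/mnt/glusterfs/ddtest bs=128k count=8192

# flush and drop the client's page cache so the read below really goes over the wire
sync && echo 3 > /proc/sys/vm/drop_caches

# read the same file back
dd if=/mnt/glusterfs/ddtest of=/dev/null bs=128k

# and the 128kB-reclen iozone pass asked about earlier (same command, larger record size)
iozone -aN -r 128k -s 131072k -f /mnt/glusterfs/sdm1/junknstuff

The transfer summary dd prints (or time plus the file size) gives the read and
write throughput for each mount.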