Hi Nick,

Thank you for using Gluster and for sending us such a detailed description of the problem you are seeing. We will try a run with exactly the same switches and config as you mention and see if we can reproduce this in-house to make debugging easier.

Regards,
Tejas.

----- Original Message -----
From: "Nick Birkett" <nick at streamline-computing.com>
To: gluster-users at gluster.org
Sent: Wednesday, December 23, 2009 3:04:43 PM GMT +05:30 Chennai, Kolkata, Mumbai, New Delhi
Subject: gluster 3.0 read hangs

I ran some benchmarks last week using 2.0.8: a single server with 8 Intel e1000e NICs bonded as mode=balance-alb. All worked fine and I got some good results using 8 clients. All Gigabit.

The benchmarks did 2 passes of IOZONE in network mode, using 1-8 threads per client and 1-8 clients. Each client used 32 GByte files. All jobs completed successfully. It takes about 32 hours to run through all the cases.

Yesterday I updated to 3.0.0 (server and clients) and re-configured the server and client vol files using glusterfs-volgen (renamed some of the vol names). The RedHat EL5 binary packages from the Glusterfs site were installed:

  glusterfs-server-3.0.0-1.x86_64
  glusterfs-common-3.0.0-1.x86_64
  glusterfs-client-3.0.0-1.x86_64

All works mostly OK, except that every so often the IOZONE job just stops and the network IO drops to zero. This always happens during either a read or re-read test, just as the IOZONE read test starts. It doesn't happen every time - a run may go for several hours without incident - but it has now happened 6 times on different test cases (threads/clients).

Has anyone else noticed this? Or perhaps I have done something wrong?

Vol files are attached - I know I don't need distribute over a single remote vol; this is part of a larger test with multiple vols.

Sample outputs are also attached. 4 clients with 4 files per client ran fine. 4 clients with 8 files per client hung at re-read on the 2nd pass of IOZONE. All jobs with 5 clients and 8 clients ran to completion.

Thanks,

Nick
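For context, a balance-alb bond like the one described is normally configured on RHEL5 through the bonding driver options plus per-slave ifcfg files. A minimal sketch, reusing the server address from the attached volfile; the interface names and the miimon value are illustrative assumptions, not taken from Nick's setup:

  # /etc/modprobe.conf - load the bonding driver in adaptive load balancing mode
  alias bond0 bonding
  options bond0 mode=balance-alb miimon=100

  # /etc/sysconfig/network-scripts/ifcfg-bond0 - the bonded interface
  DEVICE=bond0
  IPADDR=192.168.100.200
  NETMASK=255.255.255.0
  ONBOOT=yes
  BOOTPROTO=none

  # /etc/sysconfig/network-scripts/ifcfg-eth0 - one such file per slave NIC
  DEVICE=eth0
  MASTER=bond0
  SLAVE=yes
  ONBOOT=yes
  BOOTPROTO=none

balance-alb spreads both transmit and receive load across the slaves without needing switch support, which is why it suits a single server feeding 8 Gigabit clients.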
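For anyone unfamiliar with IOZONE's network distribution mode: the -+m option takes a host file with one line per client thread, giving the hostname, the working directory, and the path to the iozone binary on that host; repeating a hostname runs that many threads on it. The hosts.556 file used below was not posted, but based on the output it plausibly looks like this sketch (hostnames and paths inferred from the attached logs):

  # <hostname>  <working directory>  <path to iozone on that host>
  comp03  /data2/sccomp  /opt/iozone/bin/iozone
  comp03  /data2/sccomp  /opt/iozone/bin/iozone
  comp03  /data2/sccomp  /opt/iozone/bin/iozone
  comp03  /data2/sccomp  /opt/iozone/bin/iozone
  ral02   /data2/sccomp  /opt/iozone/bin/iozone
  # ...and so on, 4 lines per host for the 16-thread (4 clients x 4 files) run

The controlling node then runs the command recorded in the output below, with -t giving the total thread count across all clients and -F naming one file per thread.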
# ---- client volfile (attached) ----

volume brick00.server-e
  type protocol/client
  option transport-type tcp
  option transport.socket.nodelay on
  option transport.remote-port 6996
  option remote-host 192.168.100.200   # can be IP or hostname
  option remote-subvolume brick00
end-volume

volume distribute
  type cluster/distribute
  subvolumes brick00.server-e
end-volume

volume writebehind
  type performance/write-behind
  option cache-size 4MB
  subvolumes distribute
end-volume

volume readahead
  type performance/read-ahead
  option page-count 4
  subvolumes writebehind
end-volume

volume iocache
  type performance/io-cache
  option cache-size 1GB
  option cache-timeout 1
  subvolumes readahead
end-volume

volume quickread
  type performance/quick-read
  option cache-timeout 1
  option max-file-size 64kB
  subvolumes iocache
end-volume

volume statprefetch
  type performance/stat-prefetch
  subvolumes quickread
end-volume

# ---- server volfile (attached) ----
#glusterfsd_keep=0

volume posix00
  type storage/posix
  option directory /data/data00
end-volume

volume locks00
  type features/locks
  subvolumes posix00
end-volume

volume brick00
  type performance/io-threads
  option thread-count 8
  subvolumes locks00
end-volume

volume server
  type protocol/server
  option transport-type tcp
  option transport.socket.listen-port 6996
  option transport.socket.nodelay on
  option auth.addr.brick00.allow *
  subvolumes brick00
end-volume
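The pair of volfiles above is the kind glusterfs-volgen emits for a single export. For anyone trying to reproduce the setup, the steps are roughly as follows (a sketch only: the volgen invocation reflects the 3.0-era tool, and the /etc/glusterfs volfile paths and /data2 mount point are assumptions based on the report, not commands taken from it):

  # regenerate server and client volfiles for one export
  glusterfs-volgen --name brick00 192.168.100.200:/data/data00

  # on the server: start glusterfsd against the server volfile
  glusterfsd -f /etc/glusterfs/glusterfsd.vol

  # on each client: mount through the client volfile
  glusterfs -f /etc/glusterfs/glusterfs.vol /data2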
==========================================================
Cluster name      : Delldemo
Arch              : x86_64
SGE job submitted : Tue Dec 22 22:21:38 GMT 2009
Number of CPUS 8
Running Parallel IOZONE on ral03
Creating files in /data2/sccomp
NTHREADS=4
Total data size = 48196 MBytes

Running loop 1 of 2

    Iozone: Performance Test of File I/O
            Version $Revision: 3.326 $
            Compiled for 64 bit mode.
            Build: linux

    Contributors:William Norcott, Don Capps, Isom Crawford, Kirby Collins,
                 Al Slater, Scott Rhine, Mike Wisner, Ken Goss,
                 Steve Landherr, Brad Smith, Mark Kelly, Dr. Alain CYR,
                 Randy Dunlap, Mark Montague, Dan Million, Gavin Brebner,
                 Jean-Marc Zucconi, Jeff Blomberg, Benny Halevy,
                 Erik Habbinga, Kris Strecker, Walter Wong, Joshua Root.

    Run began: Tue Dec 22 22:21:38 2009

    Network distribution mode enabled.
    File size set to 12338176 KB
    Command line used: /opt/iozone/bin/iozone -+m hosts.556 -s 12049m -S 8192 -T -i 0 -i 1 -t 16 -F
        /data2/sccomp/BIG.0.comp03.streamline /data2/sccomp/BIG.1.comp03.streamline
        /data2/sccomp/BIG.2.comp03.streamline /data2/sccomp/BIG.3.comp03.streamline
        /data2/sccomp/BIG.0.ral02.streamline /data2/sccomp/BIG.1.ral02.streamline
        /data2/sccomp/BIG.2.ral02.streamline /data2/sccomp/BIG.3.ral02.streamline
        /data2/sccomp/BIG.0.ral03.streamline /data2/sccomp/BIG.1.ral03.streamline
        /data2/sccomp/BIG.2.ral03.streamline /data2/sccomp/BIG.3.ral03.streamline
        /data2/sccomp/BIG.0.ral04.streamline /data2/sccomp/BIG.1.ral04.streamline
        /data2/sccomp/BIG.2.ral04.streamline /data2/sccomp/BIG.3.ral04.streamline
    Output is in Kbytes/sec
    Time Resolution = 0.000001 seconds.
    Processor cache size set to 8192 Kbytes.
    Processor cache line size set to 32 bytes.
    File stride size set to 17 * record size.
    Throughput test with 16 threads
    Each thread writes a 12338176 Kbyte file in 4 Kbyte records

    Test running:
    Children see throughput for 16 initial writers =  424003.62 KB/sec
    Min throughput per thread                      =   26480.14 KB/sec
    Max throughput per thread                      =   26517.04 KB/sec
    Avg throughput per thread                      =   26500.23 KB/sec
    Min xfer                                       = 12321928.00 KB

    Test running:
    Children see throughput for 16 rewriters       =  424109.61 KB/sec
    Min throughput per thread                      =   26483.30 KB/sec
    Max throughput per thread                      =   26530.66 KB/sec
    Avg throughput per thread                      =   26506.85 KB/sec
    Min xfer                                       = 12316680.00 KB

    Test running:
    Children see throughput for 16 readers         =  454358.62 KB/sec
    Min throughput per thread                      =   28298.30 KB/sec
    Max throughput per thread                      =   28592.02 KB/sec
    Avg throughput per thread                      =   28397.41 KB/sec
    Min xfer                                       = 12211568.00 KB

    Test running:
    Children see throughput for 16 re-readers      =  459262.06 KB/sec
    Min throughput per thread                      =   28600.55 KB/sec
    Max throughput per thread                      =   28892.20 KB/sec
    Avg throughput per thread                      =   28703.88 KB/sec
    Min xfer                                       = 12219504.00 KB

    Test cleanup: iozone test complete.

Running loop 2 of 2

    Iozone: Performance Test of File I/O
            Version $Revision: 3.326 $
            Compiled for 64 bit mode.
            Build: linux

    Contributors:William Norcott, Don Capps, Isom Crawford, Kirby Collins,
                 Al Slater, Scott Rhine, Mike Wisner, Ken Goss,
                 Steve Landherr, Brad Smith, Mark Kelly, Dr. Alain CYR,
                 Randy Dunlap, Mark Montague, Dan Million, Gavin Brebner,
                 Jean-Marc Zucconi, Jeff Blomberg, Benny Halevy,
                 Erik Habbinga, Kris Strecker, Walter Wong, Joshua Root.

    Run began: Tue Dec 22 22:53:33 2009

    Network distribution mode enabled.
    File size set to 12338176 KB
    Command line used: /opt/iozone/bin/iozone -+m hosts.556 -s 12049m -S 8192 -T -i 0 -i 1 -t 16 -F
        /data2/sccomp/BIG.0.comp03.streamline /data2/sccomp/BIG.1.comp03.streamline
        /data2/sccomp/BIG.2.comp03.streamline /data2/sccomp/BIG.3.comp03.streamline
        /data2/sccomp/BIG.0.ral02.streamline /data2/sccomp/BIG.1.ral02.streamline
        /data2/sccomp/BIG.2.ral02.streamline /data2/sccomp/BIG.3.ral02.streamline
        /data2/sccomp/BIG.0.ral03.streamline /data2/sccomp/BIG.1.ral03.streamline
        /data2/sccomp/BIG.2.ral03.streamline /data2/sccomp/BIG.3.ral03.streamline
        /data2/sccomp/BIG.0.ral04.streamline /data2/sccomp/BIG.1.ral04.streamline
        /data2/sccomp/BIG.2.ral04.streamline /data2/sccomp/BIG.3.ral04.streamline
    Output is in Kbytes/sec
    Time Resolution = 0.000001 seconds.
    Processor cache size set to 8192 Kbytes.
    Processor cache line size set to 32 bytes.
    File stride size set to 17 * record size.
    Throughput test with 16 threads
    Each thread writes a 12338176 Kbyte file in 4 Kbyte records

    Test running:
    Children see throughput for 16 initial writers =  425851.12 KB/sec
    Min throughput per thread                      =   26593.95 KB/sec
    Max throughput per thread                      =   26634.84 KB/sec
    Avg throughput per thread                      =   26615.70 KB/sec
    Min xfer                                       = 12319368.00 KB

    Test running:
    Children see throughput for 16 rewriters       =  424954.77 KB/sec
    Min throughput per thread                      =   26459.38 KB/sec
    Max throughput per thread                      =   26656.61 KB/sec
    Avg throughput per thread                      =   26559.67 KB/sec
    Min xfer                                       = 12247176.00 KB

    Test running:
    Children see throughput for 16 readers         =  459433.33 KB/sec
    Min throughput per thread                      =   28449.77 KB/sec
    Max throughput per thread                      =   28964.50 KB/sec
    Avg throughput per thread                      =   28714.58 KB/sec
    Min xfer                                       = 12119024.00 KB

    Test running:
    Children see throughput for 16 re-readers      =  458413.46 KB/sec
    Min throughput per thread                      =   28457.53 KB/sec
    Max throughput per thread                      =   28831.23 KB/sec
    Avg throughput per thread                      =   28650.84 KB/sec
    Min xfer                                       = 12178288.00 KB

    Test cleanup: iozone test complete.
echo
echo ---------------
echo Job output ends
echo =========================================================
echo SGE job: finished date = Tue Dec 22 23:25:20 GMT 2009
echo Total run time : 1 Hours 3 Minutes 42 Seconds
echo Time in seconds: 3822 Seconds
echo =========================================================

==========================================================
Cluster name      : Delldemo
Arch              : x86_64
SGE job submitted : Tue Dec 22 23:25:30 GMT 2009
Number of CPUS 8
Running Parallel IOZONE on comp01
Creating files in /data2/sccomp
NTHREADS=8
Total data size = 32240 MBytes

Running loop 1 of 2

    Iozone: Performance Test of File I/O
            Version $Revision: 3.326 $
            Compiled for 64 bit mode.
            Build: linux

    Contributors:William Norcott, Don Capps, Isom Crawford, Kirby Collins,
                 Al Slater, Scott Rhine, Mike Wisner, Ken Goss,
                 Steve Landherr, Brad Smith, Mark Kelly, Dr. Alain CYR,
                 Randy Dunlap, Mark Montague, Dan Million, Gavin Brebner,
                 Jean-Marc Zucconi, Jeff Blomberg, Benny Halevy,
                 Erik Habbinga, Kris Strecker, Walter Wong, Joshua Root.

    Run began: Tue Dec 22 23:25:30 2009

    Network distribution mode enabled.
    File size set to 4126720 KB
    Command line used: /opt/iozone/bin/iozone -+m hosts.557 -s 4030m -S 512 -T -i 0 -i 1 -t 32 -F
        /data2/sccomp/BIG.0.comp00.streamline /data2/sccomp/BIG.1.comp00.streamline
        /data2/sccomp/BIG.2.comp00.streamline /data2/sccomp/BIG.3.comp00.streamline
        /data2/sccomp/BIG.4.comp00.streamline /data2/sccomp/BIG.5.comp00.streamline
        /data2/sccomp/BIG.6.comp00.streamline /data2/sccomp/BIG.7.comp00.streamline
        /data2/sccomp/BIG.0.comp01.streamline /data2/sccomp/BIG.1.comp01.streamline
        /data2/sccomp/BIG.2.comp01.streamline /data2/sccomp/BIG.3.comp01.streamline
        /data2/sccomp/BIG.4.comp01.streamline /data2/sccomp/BIG.5.comp01.streamline
        /data2/sccomp/BIG.6.comp01.streamline /data2/sccomp/BIG.7.comp01.streamline
        /data2/sccomp/BIG.0.comp02.streamline /data2/sccomp/BIG.1.comp02.streamline
        /data2/sccomp/BIG.2.comp02.streamline /data2/sccomp/BIG.3.comp02.streamline
        /data2/sccomp/BIG.4.comp02.streamline /data2/sccomp/BIG.5.comp02.streamline
        /data2/sccomp/BIG.6.comp02.streamline /data2/sccomp/BIG.7.comp02.streamline
        /data2/sccomp/BIG.0.ral01.streamline /data2/sccomp/BIG.1.ral01.streamline
    Command line too long to save completely.
    Output is in Kbytes/sec
    Time Resolution = 0.000001 seconds.
    Processor cache size set to 512 Kbytes.
    Processor cache line size set to 32 bytes.
    File stride size set to 17 * record size.
    Throughput test with 32 threads
    Each thread writes a 4126720 Kbyte file in 4 Kbyte records

    Test running:
    Children see throughput for 32 initial writers =  431608.71 KB/sec
    Min throughput per thread                      =   13462.86 KB/sec
    Max throughput per thread                      =   13516.74 KB/sec
    Avg throughput per thread                      =   13487.77 KB/sec
    Min xfer                                       = 4110728.00 KB

    Test running:
    Children see throughput for 32 rewriters       =  433205.56 KB/sec
    Min throughput per thread                      =   13512.67 KB/sec
    Max throughput per thread                      =   13550.23 KB/sec
    Avg throughput per thread                      =   13537.67 KB/sec
    Min xfer                                       = 4116360.00 KB

    Test running:
    Children see throughput for 32 readers         =  458239.61 KB/sec
    Min throughput per thread                      =   13983.61 KB/sec
    Max throughput per thread                      =   14699.36 KB/sec
    Avg throughput per thread                      =   14319.99 KB/sec
    Min xfer                                       = 3925872.00 KB

    Test running:
    Children see throughput for 32 re-readers      =  457589.70 KB/sec
    Min throughput per thread                      =   13990.14 KB/sec
    Max throughput per thread                      =   14654.56 KB/sec
    Avg throughput per thread                      =   14299.68 KB/sec
    Min xfer                                       = 3939696.00 KB

    Test cleanup: iozone test complete.
Running loop 2 of 2

    Iozone: Performance Test of File I/O
            Version $Revision: 3.326 $
            Compiled for 64 bit mode.
            Build: linux

    Contributors:William Norcott, Don Capps, Isom Crawford, Kirby Collins,
                 Al Slater, Scott Rhine, Mike Wisner, Ken Goss,
                 Steve Landherr, Brad Smith, Mark Kelly, Dr. Alain CYR,
                 Randy Dunlap, Mark Montague, Dan Million, Gavin Brebner,
                 Jean-Marc Zucconi, Jeff Blomberg, Benny Halevy,
                 Erik Habbinga, Kris Strecker, Walter Wong, Joshua Root.

    Run began: Tue Dec 22 23:48:31 2009

    Network distribution mode enabled.
    File size set to 4126720 KB
    Command line used: /opt/iozone/bin/iozone -+m hosts.557 -s 4030m -S 512 -T -i 0 -i 1 -t 32 -F
        /data2/sccomp/BIG.0.comp00.streamline /data2/sccomp/BIG.1.comp00.streamline
        /data2/sccomp/BIG.2.comp00.streamline /data2/sccomp/BIG.3.comp00.streamline
        /data2/sccomp/BIG.4.comp00.streamline /data2/sccomp/BIG.5.comp00.streamline
        /data2/sccomp/BIG.6.comp00.streamline /data2/sccomp/BIG.7.comp00.streamline
        /data2/sccomp/BIG.0.comp01.streamline /data2/sccomp/BIG.1.comp01.streamline
        /data2/sccomp/BIG.2.comp01.streamline /data2/sccomp/BIG.3.comp01.streamline
        /data2/sccomp/BIG.4.comp01.streamline /data2/sccomp/BIG.5.comp01.streamline
        /data2/sccomp/BIG.6.comp01.streamline /data2/sccomp/BIG.7.comp01.streamline
        /data2/sccomp/BIG.0.comp02.streamline /data2/sccomp/BIG.1.comp02.streamline
        /data2/sccomp/BIG.2.comp02.streamline /data2/sccomp/BIG.3.comp02.streamline
        /data2/sccomp/BIG.4.comp02.streamline /data2/sccomp/BIG.5.comp02.streamline
        /data2/sccomp/BIG.6.comp02.streamline /data2/sccomp/BIG.7.comp02.streamline
        /data2/sccomp/BIG.0.ral01.streamline /data2/sccomp/BIG.1.ral01.streamline
    Command line too long to save completely.
    Output is in Kbytes/sec
    Time Resolution = 0.000001 seconds.
    Processor cache size set to 512 Kbytes.
    Processor cache line size set to 32 bytes.
    File stride size set to 17 * record size.
    Throughput test with 32 threads
    Each thread writes a 4126720 Kbyte file in 4 Kbyte records

    Test running:
    Children see throughput for 32 initial writers =  432863.52 KB/sec
    Min throughput per thread                      =   13489.46 KB/sec
    Max throughput per thread                      =   13564.23 KB/sec
    Avg throughput per thread                      =   13526.99 KB/sec
    Min xfer                                       = 4104456.00 KB

    Test running:
    Children see throughput for 32 rewriters       =  433386.73 KB/sec
    Min throughput per thread                      =   13525.65 KB/sec
    Max throughput per thread                      =   13553.97 KB/sec
    Avg throughput per thread                      =   13543.34 KB/sec
    Min xfer                                       = 4118280.00 KB

    Test running:
    Children see throughput for 32 readers         =  458043.86 KB/sec
    Min throughput per thread                      =   13969.76 KB/sec
    Max throughput per thread                      =   14944.34 KB/sec
    Avg throughput per thread                      =   14313.87 KB/sec
    Min xfer                                       = 3857648.00 KB

    Test running:
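Note that the second sample ends at the "Test running:" line where the 32-thread re-read pass begins, matching the hang described in the report. When reproducing, one low-effort way to capture more detail at the hang point is to remount the clients with debug logging; a sketch, assuming the default /etc/glusterfs volfile path and a log location (the -f, -L and -l flags are standard in 3.0-era glusterfs):

  # remount a client with debug logging
  glusterfs -f /etc/glusterfs/glusterfs.vol -L DEBUG -l /var/log/glusterfs/client.log /data2

  # when the read test stalls, watch the client log for the last calls in flight
  tail -f /var/log/glusterfs/client.log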