On Mon, Apr 23, 2007 at 04:17:18PM -0500, David Teigland wrote:
> > Also, what's that new infrastructure?  Do you mean GFS2?  I read it
> > was not production quality yet, so I didn't mean to try it.  But again
> > you may have got something else in your head...
>
> GFS1 and GFS2 both run on the new openais-based cluster infrastructure
> (in the cluster-2.00.00 release, and the RHEL5 and HEAD cvs branches).

I've attached a little flock/plock performance test that emulates what
you're doing; could you run it on your cluster and send the results?

The only cluster I have for testing right now is based on the new code,
which includes the new dlm (which would affect flock performance) and the
new gfs_controld plock implementation.

Here are the numbers I got; I had two nodes mounting the fs and one node
running the test.

gfs_controld running with default settings:

# fplockperf
flock: filecount=100 iteration=0 elapsed time=0.098 s
flock: filecount=100 iteration=1 elapsed time=0.007 s
flock: filecount=100 iteration=2 elapsed time=0.007 s
flock: filecount=100 iteration=3 elapsed time=0.008 s
flock: filecount=100 iteration=4 elapsed time=0.007 s
total elapsed time=0.129 s
flock: filecount=500 iteration=0 elapsed time=0.483 s
flock: filecount=500 iteration=1 elapsed time=0.037 s
flock: filecount=500 iteration=2 elapsed time=0.039 s
flock: filecount=500 iteration=3 elapsed time=0.037 s
flock: filecount=500 iteration=4 elapsed time=0.037 s
total elapsed time=0.634 s
flock: filecount=1000 iteration=0 elapsed time=0.523 s
flock: filecount=1000 iteration=1 elapsed time=0.077 s
flock: filecount=1000 iteration=2 elapsed time=0.076 s
flock: filecount=1000 iteration=3 elapsed time=0.076 s
flock: filecount=1000 iteration=4 elapsed time=0.076 s
total elapsed time=0.830 s
flock: filecount=2000 iteration=0 elapsed time=1.064 s
flock: filecount=2000 iteration=1 elapsed time=0.151 s
flock: filecount=2000 iteration=2 elapsed time=0.151 s
flock: filecount=2000 iteration=3 elapsed time=0.146 s
flock: filecount=2000 iteration=4 elapsed time=0.147 s
total elapsed time=1.661 s
flock: filecount=5000 iteration=0 elapsed time=3.505 s
flock: filecount=5000 iteration=1 elapsed time=0.405 s
flock: filecount=5000 iteration=2 elapsed time=0.407 s
flock: filecount=5000 iteration=3 elapsed time=0.405 s
flock: filecount=5000 iteration=4 elapsed time=0.405 s
total elapsed time=5.128 s
plock: filecount=100 iteration=0 elapsed time=0.621 s
plock: filecount=100 iteration=1 elapsed time=2.108 s
plock: filecount=100 iteration=2 elapsed time=2.099 s
plock: filecount=100 iteration=3 elapsed time=2.099 s
plock: filecount=100 iteration=4 elapsed time=2.011 s
total elapsed time=8.939 s
plock: filecount=500 iteration=0 elapsed time=10.519 s
plock: filecount=500 iteration=1 elapsed time=10.350 s
plock: filecount=500 iteration=2 elapsed time=10.457 s
plock: filecount=500 iteration=3 elapsed time=10.178 s
plock: filecount=500 iteration=4 elapsed time=10.164 s
total elapsed time=51.669 s
plock: filecount=1000 iteration=0 elapsed time=20.533 s
plock: filecount=1000 iteration=1 elapsed time=20.379 s
plock: filecount=1000 iteration=2 elapsed time=20.656 s
plock: filecount=1000 iteration=3 elapsed time=20.494 s
plock: filecount=1000 iteration=4 elapsed time=20.810 s
total elapsed time=102.872 s
plock: filecount=2000 iteration=0 elapsed time=41.037 s
plock: filecount=2000 iteration=1 elapsed time=40.951 s
plock: filecount=2000 iteration=2 elapsed time=41.034 s
plock: filecount=2000 iteration=3 elapsed time=41.077 s
plock: filecount=2000 iteration=4 elapsed time=41.052 s
total elapsed time=205.152 s
plock: filecount=5000 iteration=0 elapsed time=103.086 s
plock: filecount=5000 iteration=1 elapsed time=102.845 s
plock: filecount=5000 iteration=2 elapsed time=102.391 s
plock: filecount=5000 iteration=3 elapsed time=103.586 s
plock: filecount=5000 iteration=4 elapsed time=102.104 s
total elapsed time=514.012 s

The new dlm caches resource lookups, which is probably responsible for the
big difference between the first flock iteration and the following four.
These cached rsbs are dropped after about 10 seconds, so I've added a
delay between each filecount setting to separate the caching effects.
After adding the delay we see the expected slight rise in the initial
iteration time for flocks.  (In the new infrastructure, flocks are still
implemented with dlm locks, but plocks aren't, so the dlm has no effect
on plock performance.)

# fplockperf 20
flock: filecount=100 iteration=0 elapsed time=0.100 s
flock: filecount=100 iteration=1 elapsed time=0.008 s
flock: filecount=100 iteration=2 elapsed time=0.008 s
flock: filecount=100 iteration=3 elapsed time=0.007 s
flock: filecount=100 iteration=4 elapsed time=0.007 s
total elapsed time=0.131 s
20 sec delay...
flock: filecount=500 iteration=0 elapsed time=0.559 s
flock: filecount=500 iteration=1 elapsed time=0.038 s
flock: filecount=500 iteration=2 elapsed time=0.037 s
flock: filecount=500 iteration=3 elapsed time=0.036 s
flock: filecount=500 iteration=4 elapsed time=0.036 s
total elapsed time=0.707 s
20 sec delay...
flock: filecount=1000 iteration=0 elapsed time=1.057 s
flock: filecount=1000 iteration=1 elapsed time=0.076 s
flock: filecount=1000 iteration=2 elapsed time=0.076 s
flock: filecount=1000 iteration=3 elapsed time=0.078 s
flock: filecount=1000 iteration=4 elapsed time=0.072 s
total elapsed time=1.359 s
20 sec delay...
flock: filecount=2000 iteration=0 elapsed time=2.042 s
flock: filecount=2000 iteration=1 elapsed time=0.156 s
flock: filecount=2000 iteration=2 elapsed time=0.154 s
flock: filecount=2000 iteration=3 elapsed time=0.155 s
flock: filecount=2000 iteration=4 elapsed time=0.155 s
total elapsed time=2.663 s
20 sec delay...
flock: filecount=5000 iteration=0 elapsed time=5.169 s
flock: filecount=5000 iteration=1 elapsed time=0.397 s
flock: filecount=5000 iteration=2 elapsed time=0.399 s
flock: filecount=5000 iteration=3 elapsed time=0.392 s
flock: filecount=5000 iteration=4 elapsed time=0.386 s
total elapsed time=6.743 s

By default, gfs_controld limits the rate at which plocks can be
sent/acquired so that the network isn't accidentally flooded.  We can
disable this rate limiting to speed up the plocks with 'gfs_controld -l0',
which I've done next.

# fplockperf 20
flock: filecount=100 iteration=0 elapsed time=0.094 s
flock: filecount=100 iteration=1 elapsed time=0.007 s
flock: filecount=100 iteration=2 elapsed time=0.007 s
flock: filecount=100 iteration=3 elapsed time=0.007 s
flock: filecount=100 iteration=4 elapsed time=0.007 s
total elapsed time=0.125 s
20 sec delay...
flock: filecount=500 iteration=0 elapsed time=0.551 s
flock: filecount=500 iteration=1 elapsed time=0.036 s
flock: filecount=500 iteration=2 elapsed time=0.035 s
flock: filecount=500 iteration=3 elapsed time=0.036 s
flock: filecount=500 iteration=4 elapsed time=0.035 s
total elapsed time=0.693 s
20 sec delay...
flock: filecount=1000 iteration=0 elapsed time=1.031 s
flock: filecount=1000 iteration=1 elapsed time=0.075 s
flock: filecount=1000 iteration=2 elapsed time=0.074 s
flock: filecount=1000 iteration=3 elapsed time=0.074 s
flock: filecount=1000 iteration=4 elapsed time=0.075 s
total elapsed time=1.329 s
20 sec delay...
flock: filecount=2000 iteration=0 elapsed time=1.985 s
flock: filecount=2000 iteration=1 elapsed time=0.146 s
flock: filecount=2000 iteration=2 elapsed time=0.156 s
flock: filecount=2000 iteration=3 elapsed time=0.145 s
flock: filecount=2000 iteration=4 elapsed time=0.150 s
total elapsed time=2.583 s
20 sec delay...
flock: filecount=5000 iteration=0 elapsed time=5.064 s
flock: filecount=5000 iteration=1 elapsed time=0.378 s
flock: filecount=5000 iteration=2 elapsed time=0.378 s
flock: filecount=5000 iteration=3 elapsed time=0.378 s
flock: filecount=5000 iteration=4 elapsed time=0.379 s
total elapsed time=6.577 s
20 sec delay...
plock: filecount=100 iteration=0 elapsed time=0.604 s
plock: filecount=100 iteration=1 elapsed time=0.601 s
plock: filecount=100 iteration=2 elapsed time=0.599 s
plock: filecount=100 iteration=3 elapsed time=0.600 s
plock: filecount=100 iteration=4 elapsed time=0.600 s
total elapsed time=3.005 s
20 sec delay...
plock: filecount=500 iteration=0 elapsed time=3.010 s
plock: filecount=500 iteration=1 elapsed time=3.008 s
plock: filecount=500 iteration=2 elapsed time=2.993 s
plock: filecount=500 iteration=3 elapsed time=3.013 s
plock: filecount=500 iteration=4 elapsed time=3.006 s
total elapsed time=15.032 s
20 sec delay...
plock: filecount=1000 iteration=0 elapsed time=6.030 s
plock: filecount=1000 iteration=1 elapsed time=6.075 s
plock: filecount=1000 iteration=2 elapsed time=6.023 s
plock: filecount=1000 iteration=3 elapsed time=6.074 s
plock: filecount=1000 iteration=4 elapsed time=6.088 s
total elapsed time=30.291 s
20 sec delay...
plock: filecount=2000 iteration=0 elapsed time=12.064 s
plock: filecount=2000 iteration=1 elapsed time=12.028 s
plock: filecount=2000 iteration=2 elapsed time=12.056 s
plock: filecount=2000 iteration=3 elapsed time=12.055 s
plock: filecount=2000 iteration=4 elapsed time=12.009 s
total elapsed time=60.212 s
20 sec delay...
plock: filecount=5000 iteration=0 elapsed time=30.128 s
plock: filecount=5000 iteration=1 elapsed time=30.126 s
plock: filecount=5000 iteration=2 elapsed time=30.102 s
plock: filecount=5000 iteration=3 elapsed time=30.129 s
plock: filecount=5000 iteration=4 elapsed time=30.400 s
total elapsed time=150.887 s

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
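[Editor's note: the fplockperf attachment is not reproduced here.  For
readers who want to run a similar comparison, the sketch below shows the
general shape such a test presumably has: take and drop an exclusive lock
on each of `filecount` files, five iterations per filecount, once with
BSD flock(2) and once with POSIX (fcntl) locks.  This is an illustration
in Python, not the actual attached test; the `time_locks` name and output
format are assumptions modeled on the result lines above.]

```python
# Sketch of a flock/plock timing loop in the spirit of fplockperf.
# NOT the actual attached test; names and output format are guesses
# modeled on the result lines quoted in this mail.
import fcntl
import os
import time

def time_locks(filecount, iterations, use_plock):
    """Lock and unlock `filecount` files, `iterations` times over.

    use_plock=False -> BSD flock(2), which on GFS goes through the dlm;
    use_plock=True  -> POSIX (fcntl) locks, handled by gfs_controld.
    """
    total = 0.0
    for it in range(iterations):
        start = time.monotonic()
        for i in range(filecount):
            fd = os.open("lockfile.%d" % i, os.O_RDWR | os.O_CREAT, 0o644)
            if use_plock:
                fcntl.lockf(fd, fcntl.LOCK_EX)   # POSIX whole-file lock
                fcntl.lockf(fd, fcntl.LOCK_UN)
            else:
                fcntl.flock(fd, fcntl.LOCK_EX)   # BSD flock
                fcntl.flock(fd, fcntl.LOCK_UN)
            os.close(fd)
        elapsed = time.monotonic() - start
        total += elapsed
        print("%s: filecount=%d iteration=%d elapsed time=%.3f s"
              % ("plock" if use_plock else "flock", filecount, it, elapsed))
    print("total elapsed time=%.3f s" % total)
    return total

if __name__ == "__main__":
    for count in (100, 500, 1000, 2000, 5000):
        time_locks(count, 5, use_plock=False)
    for count in (100, 500, 1000, 2000, 5000):
        time_locks(count, 5, use_plock=True)
```

On a local filesystem both loops finish almost instantly; the interesting
numbers only appear when the files live on a cluster filesystem, where the
two lock types take different paths through the cluster infrastructure.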