David Teigland <teigland@xxxxxxxxxx> writes:

>>>> But looks like nodeA feels obliged to communicate its locking
>>>> process around the cluster.
>>>
>>> I'm not sure what you mean here.  To see the amount of dlm locking traffic
>>> on the network, look at port 21064.  There should be very little in the
>>> test above... and the dlm locking that you do see should mostly be related
>>> to file i/o, not flocks.
>>
>> There was much traffic on port 21064.  Possibly related to file I/O
>> and not flocks, I can't tell.  But that agrees with my speculation,
>> that it's not the explicit [pf]locks that take much time, but
>> something else.
>
> Could you comment the fcntl/flock calls out of the application entirely
> and try it?

Let's see.  A typical test run looks like this (first with fcntl
locking; running tcpdump slows the first iteration down from about 6 s
to about 20 s):

  filecount=500
  iteration=0 elapsed time=20.196318 s
  iteration=1 elapsed time=0.323969 s
  iteration=2 elapsed time=0.319929 s
  iteration=3 elapsed time=0.361738 s
  iteration=4 elapsed time=0.399365 s
  total elapsed time=21.601319 s

During the first (slow) iteration, there's much traffic on port 21064.
During the next (fast) iterations there's no traffic at all on that
port.  If I rerun the test immediately, there's still no traffic.
5 minutes later, without any action on my part, there are a couple of
packets again, then 20 s later a bigger bunch (around 30).  After this,
the first iteration generates much traffic again, GOTO 10.

If I use flock instead, the beginning is similar, but about 10 s after
the test finishes, some small traffic appears by itself, and if I rerun
the test after this, it generates traffic again, although much less
than after 5 minutes.  The traffic generated 5 minutes after the test
run consists of a couple of packets followed by a much bigger bunch 5 s
later.

If I don't use any locking at all, then the situation is the same as
with fcntl locking, but the "automatic" traffic consists of a small
burst (a couple of packets) 4 min 51 s after the finish, then about 30
packets 25 s later.

Does it tell you anything?  The timings are perhaps somewhat off
because of the 20 s runtime.

If you can make some sense out of this, I'd be glad to hear it.  Also,
I'd like to tweak the 5-minute timeout.  Where does it come from?  Is
it settable by sysfs or gfs_tool?
--
Thanks,
Feri.
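
For readers who want to reproduce measurements of this shape, here is a minimal sketch of a comparable test. It is not the program used above, which is not shown in this thread. The function names (run_iteration, run_test), the file naming scheme and the one-byte write per file are all assumptions made for illustration. It only mirrors the three variants compared (fcntl locks, flock, no locking) and the output format of the run quoted above.

```python
import fcntl
import os
import time

MODES = ("fcntl", "flock", "none")


def run_iteration(directory, filecount, mode):
    """Open each file once, optionally lock it, write a byte, unlock.

    Returns the elapsed wall-clock time in seconds.
    """
    start = time.monotonic()
    for i in range(filecount):
        # Hypothetical naming scheme: file0000, file0001, ...
        path = os.path.join(directory, "file%04d" % i)
        fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
        try:
            if mode == "fcntl":
                # POSIX (fcntl-style) lock on the whole file.
                fcntl.lockf(fd, fcntl.LOCK_EX)
            elif mode == "flock":
                # BSD-style flock on the open file.
                fcntl.flock(fd, fcntl.LOCK_EX)
            os.write(fd, b"x")
            if mode == "fcntl":
                fcntl.lockf(fd, fcntl.LOCK_UN)
            elif mode == "flock":
                fcntl.flock(fd, fcntl.LOCK_UN)
        finally:
            os.close(fd)
    return time.monotonic() - start


def run_test(directory, filecount=500, iterations=5, mode="fcntl"):
    """Run several iterations and print timings in the format used above."""
    if mode not in MODES:
        raise ValueError("mode must be one of %r" % (MODES,))
    print("filecount=%d" % filecount)
    times = []
    for iteration in range(iterations):
        elapsed = run_iteration(directory, filecount, mode)
        times.append(elapsed)
        print("iteration=%d elapsed time=%.6f s" % (iteration, elapsed))
    # The total is the sum of the per-iteration times, as in the run above.
    print("total elapsed time=%.6f s" % sum(times))
    return times
```

Calling run_test() with a directory on the GFS mount (the path is up to you) and mode set to "fcntl", "flock" or "none" in turn, while watching port 21064 with tcpdump, should allow the same kind of comparison. Whether the traffic pattern matches will depend on the cluster setup.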