On Wed, 2007-07-11 at 13:01 -0400, Wendy Cheng wrote:
> Christopher Barry wrote:
> > On Tue, 2007-07-10 at 22:23 -0400, Wendy Cheng wrote:
> >
> >> Pavel Stano wrote:
> >>
> >>> and then run touch on node 1:
> >>> serpico# touch /d/0/test
> >>>
> >>> and ls on node 2:
> >>> dinorscio:~# time ls /d/0/
> >>> test
> >>>
> >> What have you expected from a cluster filesystem ? When you touch a file
> >> on node 1, it is a "create" that requires at least 2 exclusive locks
> >> (directory lock and the file lock itself, among many other things). On a
> >> local filesystem such as ext3, disk activities are delayed due to
> >> filesystem cache where "touch" writes the data into cache and "ls" reads
> >> it from cache on the very same node - all memory operations. On cluster
> >> filesystem, when you do an "ls" on node 2, node 2 needs to ask node 1 to
> >> release the locks (few ping-pong messages between two nodes and lock
> >> managers via network), the contents inside node 1's cache need to get
> >> synced to the shared storage. After node 2 gets the locks, it has to
> >> read contents from the disk.
> >>
> >> I hope the above explanation is clear.
> >>
> >>> and last thing, i try gfs2, but same result
> >>>
> >> -- Wendy
> >>
> >
> > This seems a little bit odd to me. I'm running a RH 7.3 cluster,
> > pre-redhat Sistina GFS, lock_gulm, 1GB FC shared disk, and have been
> > since ~2002.
> >
> > Here's the timing I get for the same basic test between two nodes:
> >
> > [root@sbc1 root]# cd /mnt/gfs/workspace/cbarry/
> > [root@sbc1 cbarry]# mkdir tst
> > [root@sbc1 cbarry]# cd tst
> > [root@sbc1 tst]# time touch testfile
> >
> > real 0m0.094s
> > user 0m0.000s
> > sys 0m0.000s
> > [root@sbc1 tst]# time ls -la testfile
> > -rw-r--r-- 1 root root 0 Jul 11 12:20 testfile
> >
> > real 0m0.122s
> > user 0m0.010s
> > sys 0m0.000s
> > [root@sbc1 tst]#
> >
> > Then immediately from the other node:
> >
> > [root@sbc2 root]# cd /mnt/gfs/workspace/cbarry/
> > [root@sbc2 cbarry]# time ls -la tst
> > total 12
> > drwxr-xr-x 2 root root 3864 Jul 11 12:20 .
> > drwxr-xr-x 4 cbarry cbarry 3864 Jul 11 12:20 ..
> > -rw-r--r-- 1 root root 0 Jul 11 12:20 testfile
> >
> > real 0m0.088s
> > user 0m0.010s
> > sys 0m0.000s
> > [root@sbc2 cbarry]#
> >
> > Now, you cannot tell me 10 seconds is 'normal' for a clustered fs. That
> > just does not fly. My guess is DLM is causing problems.
> >
> From previous post, we really can't tell since the network and disk
> speeds are variables and unknown. However, look at your data:
>
> local "ls" is 0.122s
> remote "ls" is 0.088s
>
> I bet the disk flushing happened during first "ls" (and different base
> kernels treat their dirty data flush and IO scheduling differently). I
> can't be convinced that DLM is an issue - unless the experiment has
> collected enough sample that has its statistical significance.
>
> -- Wendy
>
> --
> Linux-cluster mailing list
> Linux-cluster@xxxxxxxxxx
> https://www.redhat.com/mailman/listinfo/linux-cluster

Where is all the time being spent? Certainly, it should not take 10
seconds.

Let me see if I have the series of events right, and you can correct me
where I'm wrong.

Node1:
  touch is run, and asks (indirectly) for 2 exclusive write locks.
  dlm grants the locks.
  File is created in cache.
  locks are released (now?)
  local ls is run, and asks for a read lock.
  dlm grants the lock.
  ls reads the cache.
  returns results to screen.
  lock is released.

Node2:
  remote ls is run, and asks for a read lock.
  ... what happens here?

I think you're saying dlm looks at the lock request and says "I can't
give it to you, because the buffer has not been sync'd to disk yet."
Does Node2 wait, retry the lock request after some time period, and
keep doing this in a loop? Does the dlm on Node1 request that the data
be sync'd so that the requesting Node2 can access it sooner?

If Pavel used dd rather than touch to create a file larger than the
buffer, and then ran ls on Node2, would this show far better
performance? Is the real issue the corner case of a 0 byte file being
created?

Basically, I think you're saying that the kernel is keeping the 0 byte
touched file in cache, and GFS and/or dlm cannot help with this
situation. Is that correct?

--
Regards,
-C
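
P.S. On the point about collecting enough samples: below is a rough sketch of how the create-on-one-node, list-on-the-other test could be repeated and summarized, rather than judged from a single run. It is only an illustration under assumptions that are not from this thread: that the node running it can reach the other node over passwordless ssh, and that both mount the same filesystem at the same path. The names `remote_touch`, `run_trials` and `summarize` are made up for this sketch.

```python
import os
import shlex
import statistics
import subprocess
import time


def remote_touch(host, directory):
    """Return a callable that creates an empty file in `directory` from `host`.

    Assumption: passwordless ssh to `host`, which mounts the same
    cluster filesystem at the same path.
    """
    def create(name):
        path = shlex.quote(os.path.join(directory, name))
        subprocess.run(["ssh", host, "touch " + path], check=True)
    return create


def run_trials(directory, create, trials=10):
    """Create a fresh file via `create`, then time a local directory listing.

    Returns one wall-clock duration (in seconds) per trial.
    """
    durations = []
    for i in range(trials):
        # A new name each round, so every trial is a real "create".
        create(f"probe-{os.getpid()}-{i}")
        start = time.perf_counter()
        os.listdir(directory)
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Reduce a list of durations to a few summary statistics."""
    return {
        "n": len(durations),
        "min": min(durations),
        "median": statistics.median(durations),
        "max": max(durations),
        "stdev": statistics.stdev(durations) if len(durations) > 1 else 0.0,
    }
```

Run on node 2, something like `summarize(run_trials("/d/0", remote_touch("serpico", "/d/0"), trials=30))` would give a spread instead of one number (host and path borrowed from Pavel's example). Note that `os.listdir` only reads the directory; `ls -la` also stats every entry, so the two are not strictly comparable.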