> -----Original Message-----
> From: linux-cluster-bounces@xxxxxxxxxx [mailto:linux-cluster-bounces@xxxxxxxxxx]
> On Behalf Of Peter Schobel
> Sent: Wednesday, August 11, 2010 2:04 PM
> To: linux clustering
> Subject: How does caching work in GFS1?
>
> I am having an issue with a GFS1 cluster in some use cases. Mainly,
> running du on a directory takes an unusually long time. I have the
> filesystem mounted with noatime and nodiratime statfs_fast is turned
> on. Running du on a 2G directory takes about 2 minutes and each
> subsequent run took about the same amount of time.

A stat() call over GFS is slow, period. How many files are in the 2GB
directory? I would expect the time to be a linear function of the number
of files, not of the file sizes.

The problem with du isn't that it reads the directory (which is quite
fast) but that it needs to stat() each file and directory it finds in
order to compute a total size. We have seen similar performance with a
GFS filesystem over which we regularly rsync entire directory trees.

> I have been trying to tweak tunables such as glock_purge and
> reclaim_limit but to no avail.

The only thing I found that helped was increasing demote_secs. I believe
that causes locks to be held for a longer period of time, so that the
initial directory traversal is slow but subsequent traversals are fast.
If, however, you are running "du" on multiple cluster nodes at the same
time, I don't think it will help at all.

> If I could get the same
> speedup on the 30G directory as I'm getting on the 2G directory I
> would be very happy and so would the users on the cluster.

Out of sheer curiosity, do your users need to literally run "du" commands
routinely, or is that just a simplification of the actual use case?
Depending on what your application does, there may be strategies in
software that would optimize your performance on GFS.

-Jeff

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
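
One way to check whether the du time really tracks the number of entries rather than the bytes stored is to time the per-entry stat calls directly. The following is a minimal Python sketch under stated assumptions, not a tool from GFS or from this thread: stat_walk is a hypothetical helper name, and it sums apparent file sizes (st_size) rather than the disk blocks that du reports, so the byte total will not match du exactly. The elapsed time also includes the directory reads done by os.walk, which should be small next to the lstat() calls if the reasoning above holds.

```python
import os
import time


def stat_walk(root):
    """Walk root, lstat() every entry, and return (entries, total_bytes, seconds).

    Hypothetical helper for illustration: it mimics the access pattern of du
    (one stat-family call per file and directory) so the cost per entry can
    be measured.
    """
    entries = 0
    total_bytes = 0
    start = time.perf_counter()
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            try:
                st = os.lstat(os.path.join(dirpath, name))
            except OSError:
                continue  # entry vanished or is unreadable; skip it
            entries += 1
            total_bytes += st.st_size  # apparent size, not allocated blocks
    elapsed = time.perf_counter() - start
    return entries, total_bytes, elapsed


# Example use (the path is an assumption; substitute a directory on the GFS mount):
# n, size, secs = stat_walk("/mnt/gfs/some_directory")
# print(n, size, secs, secs / n if n else 0.0)
```

Running it on the 2G and the 30G directories and comparing seconds per entry would show whether the cost is roughly constant per file. Running it twice in a row on the same node, before and after raising demote_secs, would show whether the second traversal benefits from locks that are still held.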