This morning I hit what looks like an I/O throughput problem in which dirty pages appeared to be taking a very long time to be written to disk.

The system is a large x64 Dell 810 server with 192GiB of RAM, running 2.6.38.5 from kernel.org. The basic workload is data intensive: concurrent heavy NFS traffic (high metadata, small files) plus rsync/lftp (low metadata, large files), all working in a 200TiB XFS volume on a software MD RAID0 striped over 7 software MD RAID6 arrays of 18 drives each. The filesystem is mounted with inode64,largeio,logbufs=8,noatime.

The specific symptoms were that 'sync' hung, a dpkg command hung (presumably trying to issue fsync), and stopping the workload jobs with "killall -STOP" or "kill -STOP" didn't let the system drain enough I/O for the sync to finish - though I probably did not wait long enough.

Here's what I did to diagnose it. With all workloads stopped, there was still low-rate I/O going from kflush to the md array threads. There was no CPU starvation, but the I/O rate was only 5-30MiB/s, on an array that can readily do >1000MiB/s for large I/O. Mind you, a single "md5sum --check" job ran at >200MiB/s without trouble - start or stop it and the aggregate I/O load jumps right up or down with it - so I'm fairly confident in the underlying physical arrays as well as in XFS for large data I/O.

I ran "echo 3 > /proc/sys/vm/drop_caches" repeatedly and noticed that, according to top, the total amount of cached data dropped rapidly (the first run produced the big drop) but then stayed stuck at around 8-10GiB. Continuing to watch, I eventually noticed that the cached value was in fact draining slowly, at that same 5-30MiB/s, until it finally reached roughly 60MiB - at which point the stuck dpkg command completed and I was again able to issue sync commands that finished instantly.

My guess is that I've done something to fill the buffer cache with slow-to-flush metadata; before rebooting the machine a few minutes ago, I removed the largeio option from /etc/fstab. I can't say this is an XFS bug specifically - more likely it's how I am using it. Are there other tools I can use to better diagnose what is going on? I know it will happen again, since we will soon have 5 of these machines running at very high rates.

Any suggestions for better metadata or log management are also very welcome. This particular machine is probably our worst case, since it sees the widest variation in offered file I/O load (tens of millions of small files, thousands of >1GB files). If this workload is pushing XFS too hard, I can deploy new hardware and split the workload across separate filesystems.

Thanks very much for any thoughts or suggestions,

Paul Anderson
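
P.S. For next time it happens (and it will), here is a rough sketch of what I plan to capture - the sampling interval and log name are arbitrary, and the last two commands assume sysrq and /proc/<pid>/stack are available on this kernel:

    # sample writeback state and raw per-md I/O counters every 10s
    while sleep 10; do
        date
        grep -E '^(Dirty|Writeback|NFS_Unstable):' /proc/meminfo
        grep -E ' md[0-9]+ ' /proc/diskstats
    done > writeback-trace.log 2>&1 &

    # while sync/dpkg is actually stuck: dump blocked-task stacks to dmesg,
    # and grab the stack of the hung sync itself
    echo w > /proc/sysrq-trigger
    cat /proc/$(pgrep -xn sync)/stack

The idea is to confirm whether Dirty/Writeback sits nearly flat while the md counters show only the 5-30MiB/s trickle, and to see from the blocked-task stacks where sync and the flusher threads are actually waiting.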