On Sat, May 21, 2011 at 10:45:44AM +1000, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
> > Longer meaning practically infinitely :)
>
> No, longer meaning the in-memory lifecycle of the inode.

That makes no sense - if I have twice the memory, I suddenly have less
(by half, or some other factor) free disk space. The lifetime of the
preallocated area should be tied to something sensible, really - all
that xfs has now is a broken heuristic that ties the wrong statistic
to the extra space allocated.

Or in other words, tying the amount of preallocation to the amount of
free RAM (which decides how long the inode stays cached) is not a
sensible heuristic.

> log file writing - append only workloads - is one where the dynamic
> speculative preallocation can make a significant difference.

That's absolutely fantastic, as it will apply to a large range of
files that are problematic (while xfs performs really well in most
cases).

> > However, I would suggest that whatever heuristic 2.6.38 uses is deeply
> > broken at the moment,
>
> One bug report two months after general availability != deeply
> broken.

That makes no sense - I only found out about this broken behaviour
because I specified a large allocsize manually, which is rare.
However, the behaviour happens even without that, but might not be
immediately noticeable (how would you find out that you lost a few
gigabytes of disk space unless the disk runs full? most people would
have no clue where to look). Just because the breakage is not
obviously visible doesn't mean it's not deeply broken.

Also, I just looked more thoroughly through the list - the problem has
been reported before, but was basically ignored, so you are wrong that
there is only one report.

> While using a large allocsize mount option, which is relatively
> rare. Basically, you've told XFS to optimise allocation for large
> files and then are running workloads with lots of small files.

The allocsize option isn't "optimise for large files", it's there to
reduce fragmentation, and 64MB is _hardly_ a big size for logfiles.
Note also that the breakage occurs with smaller allocsize values as
well; it's just less obvious.

All you do right now is make up fantasy reasons for ignoring this
report - the problem applies to any allocsize, and, unless xfs uses a
different heuristic for dynamic preallocation, even without the mount
option.

> It's no surprise that there are issues, and you don't need the changes
> in 2.6.38 to get bitten by this problem....

Really? I do know (by measuring it) that older kernels do not have
this problem, and you basically said the same thing, namely that there
was a behaviour change. If your goal is to argue for yourself that the
breakage has to stay, that's fine, but don't expect users (like me) to
follow your illogical train of thought.

> > and there is really no need to cache this preallocation for
> > files that have been closed 8 hours ago and never touched since then.
>
> If the preallocation was the size of the dynamic behaviour, you
> wouldn't have even noticed this.

Maybe; it certainly is a lot less noticeable. But the new xfs
behaviour basically means you have less space (potentially a lot less)
on your disk when you have more memory, and that disk space is lost
indefinitely just because I have some free RAM. This is simply not a
sensible heuristic - more RAM must not mean that potentially large
amounts of disk space are lost for as long as there is enough RAM to
keep the inodes cached.
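For anyone who wants to see how much space this is on their own box,
here is a quick sketch (python; the /var/log default and the 1MiB
threshold are just examples I picked) that walks a tree and reports
files whose allocated space is noticeably larger than their apparent
size - which is what lingering preallocation looks like from
userspace:

  #!/usr/bin/env python
  # Report files whose allocated size exceeds their apparent size by
  # more than a threshold - a hint that extra blocks (e.g. speculative
  # preallocation) are still attached to them.
  import os, sys

  root = sys.argv[1] if len(sys.argv) > 1 else "/var/log"
  threshold = 1 << 20  # only report differences above 1 MiB

  total = 0
  for dirpath, dirnames, filenames in os.walk(root):
      for name in filenames:
          path = os.path.join(dirpath, name)
          try:
              st = os.lstat(path)
          except OSError:
              continue
          allocated = st.st_blocks * 512  # st_blocks is in 512-byte units
          excess = allocated - st.st_size
          if excess > threshold:
              total += excess
              print("%10d KiB extra  %s" % (excess // 1024, path))
  print("total excess: %d MiB" % (total // (1024 * 1024)))

Comparing "du" against "du --apparent-size" on the same tree should
give roughly the same picture, but the per-file listing makes it
easier to see which files are actually holding on to the space.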
> So really what you are saying is that it is excessive for your current
> configuration and workload.

No, what I am saying is that the heuristic is simply buggy - it ties
one value (the RAM available for the inode cache) to a completely
unrelated one (the amount of free space used for preallocation). It
also doesn't happen in my workload only.

> better for allocsize filesystems. However, I'm not going to start to
> add lots of workload-dependent tweaks to this code - the default
> behaviour is much better and in most cases removes the problems that
> led to using allocsize in the first place. So removing allocsize
> from your config is, IMO, the correct fix, not tweaking heuristics in
> the code...

I am fine with not using allocsize if the fragmentation problems in
xfs (for append-only cases) have been improved. But you said the
heuristic applies regardless of whether allocsize was specified or
not.

-- 
Marc Lehmann
schmorp@xxxxxxxxxx
Deliantra, the free code+content MORPG: http://www.deliantra.net