On Fri, Apr 23, 2010 at 02:38:01AM +1000, Nick Piggin wrote:
> On Thu, Apr 22, 2010 at 12:32:11PM -0400, Christoph Hellwig wrote:
> > On Wed, Apr 21, 2010 at 06:40:04PM +1000, Nick Piggin wrote:
> > > I'm saying that dynamic registration is no good, if we don't have a
> > > way to order the shrinkers.
> >
> > We can happily throw in a priority field into the shrinker structure,
> > but at this stage in the release process I'd rather have an as simple
> > as possible fix for the regression. And just adding the context pointer
> > which is a no-op for all existing shrinkers fits that scheme very well.
> >
> > If it makes you happier I can queue up a patch to add the priorities
> > for 2.6.35. I think figuring out any meaningful priorities will be
> > much harder than that, though.
>
> I don't understand, it should be implemented like just all the other
> shrinkers AFAIKS. Like the dcache one that has to shrink multiple
> superblocks. There is absolutely no requirement for this API change
> to implement it in XFS.

Well, I've gone and done this global shrinker because I need a fix
for the problem before .34 releases, not because I like it.

Now my problem is that the accepted method of using global shrinkers
(i.e. split nr_to_scan into portions based on per-fs usage) is
causing a regression compared to not having a shrinker at all. The
context based shrinker did not cause this regression, either.

The regression is oom-killer panics with "no killable tasks" - it
kills my 1GB RAM VM dead. Without a shrinker or with the context
based shrinkers I will see one or two dd processes getting
OOM-killed maybe once every 10 or so runs on this VM, but the
machine continues to stay up. The global shrinker is turning this
into a panic, and it is happening about twice as often.

To fix this I've had to remove all the code that proportions the
reclaim across all the XFS filesystems in the system.
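
As a rough illustration only, here is a small userspace C model of the
two policies: splitting nr_to_scan across filesystems in proportion to
their reclaimable inodes, and a plain first-to-last walk of the list.
This is not the XFS code. struct fs_model, fs_reclaim(),
shrink_proportional() and shrink_first_fit() are hypothetical names
made up for the sketch, and locking and the real shrinker callback
signature are left out entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one mounted filesystem (not an XFS structure). */
struct fs_model {
    unsigned long reclaimable;  /* inodes currently eligible for reclaim */
};

/* Reclaim up to nr inodes from one filesystem; returns how many were freed. */
static unsigned long fs_reclaim(struct fs_model *fs, unsigned long nr)
{
    unsigned long n = nr < fs->reclaimable ? nr : fs->reclaimable;

    fs->reclaimable -= n;
    return n;
}

/*
 * Proportional policy: every filesystem is asked for a share of
 * nr_to_scan equal to its share of the total reclaimable inodes.
 * Shares are rounded down, so slightly less than nr_to_scan may be freed.
 */
unsigned long shrink_proportional(struct fs_model *list, size_t count,
                                  unsigned long nr_to_scan)
{
    unsigned long total = 0, freed = 0;
    size_t i;

    assert(list != NULL || count == 0);

    for (i = 0; i < count; i++)
        total += list[i].reclaimable;
    if (total == 0)
        return 0;

    for (i = 0; i < count; i++) {
        unsigned long share = (unsigned long)
            ((unsigned long long)nr_to_scan * list[i].reclaimable / total);

        freed += fs_reclaim(&list[i], share);
    }
    return freed;
}

/*
 * First-fit policy: walk the list from the first entry and take as much
 * as possible from each filesystem until nr_to_scan is used up. In
 * practice the first filesystem with reclaimable inodes absorbs most or
 * all of the request.
 */
unsigned long shrink_first_fit(struct fs_model *list, size_t count,
                               unsigned long nr_to_scan)
{
    unsigned long freed = 0;
    size_t i;

    for (i = 0; i < count && nr_to_scan > 0; i++) {
        unsigned long n = fs_reclaim(&list[i], nr_to_scan);

        freed += n;
        nr_to_scan -= n;
    }
    return freed;
}
```

In this model the proportional version touches every filesystem on
each call, while the first-fit version normally stops at the first one
that can satisfy the request.
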
Basically it now walks from the first filesystem in the list to the
last every time and effectively it only reclaims from the first
filesystem it finds with reclaimable inodes. This is exactly the
behaviour the context based shrinkers give me, without the need for
adding global lists, additional locking and traversals.

Also, context based shrinkers won't re-traverse all the filesystems,
avoiding the potential for starving some filesystems of shrinker
based reclaim if filesystems earlier in the list are putting more
inodes into reclaim concurrently.

Given that this behaviour matches pretty closely to the reasons I've
already given for preferring context based per-fs shrinkers over a
global shrinker and list, can we please move forward with this API
change, Nick?

As it is, I'm going to cross my fingers and ship this global
shrinker because of time limitations, but I am certainly hoping that
for .35 we can move to context based shrinking....

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx