Hello,

> On Tue, Oct 19, 2010 at 02:42:47PM +1100, npiggin@xxxxxxxxx wrote:
> > Per-zone LRUs and shrinkers for inode cache.
>
> Regardless of whether this is the right way to scale or not, I don't
> like the fact that this moves the cache LRUs into the memory
> management structures, and expands the use of MM specific structures
> throughout the code. It ties the cache implementation to the current
> VM implementation. That, IMO, goes against all the principles of
> modularisation at the source code level, and it means we have to tie
> all shrinker implementations to the current internal implementation
> of the VM. I don't think that is a wise thing to do because of the
> dependencies and impedance mismatches it introduces.
>
> As an example: XFS inodes to be reclaimed are simply tagged in a
> radix tree so the shrinker can reclaim inodes in optimal IO order
> rather than strict LRU order. It simply does not match a zone-based
> shrinker implementation in any way, shape or form, nor does its
> inherent parallelism match that of the way shrinkers are called.
>
> Any change in shrinker infrastructure needs to be able to handle
> these sorts of impedance mismatches between the VM and the cache
> subsystem. The current API doesn't handle this very well, either,
> so it's something that we need to fix so that scalability is easy
> for everyone.
>
> Anyway, my main point is that tying the LRU and shrinker scaling to
> the implementation of the VM is a one-off solution that doesn't work
> for generic infrastructure. Other subsystems need the same
> large-machine scaling treatment, and there's no way we should be
> tying them all into the struct zone. It needs further abstraction.

I'm not sure what data structure is best. I can only say that the
current zone-unaware slab shrinker can lead to the following sad
scenarios:

 o A DMA zone shortage invokes the shrinker, and plenty of icache in
   the NORMAL zone gets dropped.
 o A NUMA-aware system enables zone_reclaim_mode, but shrink_slab()
   still drops icache that belongs to unrelated zones.

Both cause performance regressions. In other words, Linux does not have
a flat memory model, so I don't think Nick's basic concept is wrong.
It's a straightforward enhancement. But if it doesn't fit the current
shrinkers, I'd like to discuss how to make a better data structure.

I also have a dumb question (sorry, I don't know xfs at all). The
current xfs_mount is below:

typedef struct xfs_mount {
	...
	struct shrinker		m_inode_shrink;	/* inode reclaim shrinker */
} xfs_mount_t;

Do you mean xfs can't convert shrinker to shrinker[ZONES]? If so, why?

Thanks.
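
To make the DMA-zone scenario above concrete, here is a rough user-space C sketch, not kernel code. Everything in it is hypothetical and for illustration only: the names (toy_icache, shrink_zone_unaware, shrink_per_zone, the zone_id enum) are invented, and the "proportional" model of a zone-unaware shrinker is an assumption, not how shrink_slab() actually picks objects.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical zone ids for this sketch; not the kernel's enum zone_type. */
enum zone_id { ZONE_ID_DMA, ZONE_ID_NORMAL, NR_ZONE_IDS };

/* Toy inode cache: count of reclaimable objects backed by each zone. */
struct toy_icache {
	unsigned long nr_cached[NR_ZONE_IDS];
};

/*
 * Zone-unaware shrink (assumed model): a single LRU does not know which
 * zone an object lives in, so objects are dropped from every zone in
 * proportion to its share of the cache.
 * Returns how many objects were freed in the zone that is short of memory.
 */
static unsigned long shrink_zone_unaware(struct toy_icache *c,
					 enum zone_id target,
					 unsigned long nr_to_scan)
{
	unsigned long total = 0, freed_in_target = 0;
	int z;

	for (z = 0; z < NR_ZONE_IDS; z++)
		total += c->nr_cached[z];
	if (total == 0)
		return 0;

	for (z = 0; z < NR_ZONE_IDS; z++) {
		unsigned long n = nr_to_scan * c->nr_cached[z] / total;

		if (n > c->nr_cached[z])
			n = c->nr_cached[z];
		c->nr_cached[z] -= n;
		if (z == (int)target)
			freed_in_target = n;
	}
	return freed_in_target;
}

/*
 * Per-zone shrink: only objects in the zone under pressure are dropped.
 * Returns how many objects were freed in that zone.
 */
static unsigned long shrink_per_zone(struct toy_icache *c,
				     enum zone_id target,
				     unsigned long nr_to_scan)
{
	unsigned long n = nr_to_scan;

	if (n > c->nr_cached[target])
		n = c->nr_cached[target];
	c->nr_cached[target] -= n;
	return n;
}
```

With 10 cached objects in DMA and 990 in NORMAL, asking the zone-unaware version to scan 100 objects on behalf of DMA frees 1 object there and throws away 99 from NORMAL, while the per-zone version frees all 10 DMA objects and leaves NORMAL untouched. Whether a shrinker[ZONES] array is the right way to get that behaviour is exactly the open question.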