On Wed, Mar 07, 2012 at 03:50:28PM +1100, Dave Chinner wrote:
> From: Dave Chinner <dchinner@xxxxxxxxxx>
>
> When we read inodes via bulkstat, we generally only read them once
> and then throw them away - they never get used again. If we retain
> them in cache, then it simply causes the working set of inodes and
> other cached items to be reclaimed just so the inode cache can grow.
>
> Avoid this problem by marking inodes read by bulkstat as not to be
> cached and check this flag in .drop_inode to determine whether the
> inode should be added to the VFS LRU or not. If the inode lookup
> hits an already cached inode, then don't set the flag. If the inode
> lookup hits an inode marked with no cache flag, remove the flag and
> allow it to be cached once the current reference goes away.
>
> Inodes marked as not cached will get cleaned up by the background
> inode reclaim or via memory pressure, so they will still generate
> some short term cache pressure. They will, however, be reclaimed
> much sooner and in preference to cache hot inodes.
>
> Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>

Looks good.
Reviewed-by: Ben Myers <bpm@xxxxxxx>

> ---
>  fs/xfs/xfs_iget.c   |    8 ++++++--
>  fs/xfs/xfs_inode.h  |    4 +++-
>  fs/xfs/xfs_itable.c |    3 ++-
>  fs/xfs/xfs_super.c  |   17 +++++++++++++++++
>  4 files changed, 28 insertions(+), 4 deletions(-)
>
> diff --git a/fs/xfs/xfs_iget.c b/fs/xfs/xfs_iget.c
> index 93fc1dc..20ddb1e 100644
> --- a/fs/xfs/xfs_iget.c
> +++ b/fs/xfs/xfs_iget.c
> @@ -290,7 +290,7 @@ xfs_iget_cache_hit(
>  	if (lock_flags != 0)
>  		xfs_ilock(ip, lock_flags);
>
> -	xfs_iflags_clear(ip, XFS_ISTALE);
> +	xfs_iflags_clear(ip, XFS_ISTALE | XFS_IDONTCACHE);
>  	XFS_STATS_INC(xs_ig_found);
>
>  	return 0;
> @@ -315,6 +315,7 @@ xfs_iget_cache_miss(
>  	struct xfs_inode	*ip;
>  	int			error;
>  	xfs_agino_t		agino = XFS_INO_TO_AGINO(mp, ino);
> +	int			iflags;
>
>  	ip = xfs_inode_alloc(mp, ino);
>  	if (!ip)
> @@ -359,8 +360,11 @@ xfs_iget_cache_miss(
>  	 * memory barrier that ensures this detection works correctly at lookup
>  	 * time.
>  	 */
> +	iflags = XFS_INEW;
> +	if (flags & XFS_IGET_DONTCACHE)
> +		iflags |= XFS_IDONTCACHE;
>  	ip->i_udquot = ip->i_gdquot = NULL;
> -	xfs_iflags_set(ip, XFS_INEW);
> +	xfs_iflags_set(ip, iflags);
>
>  	/* insert the new inode */
>  	spin_lock(&pag->pag_ici_lock);
> diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h
> index eda4937..096b887 100644
> --- a/fs/xfs/xfs_inode.h
> +++ b/fs/xfs/xfs_inode.h
> @@ -374,10 +374,11 @@ xfs_set_projid(struct xfs_inode *ip,
>  #define XFS_IFLOCK		(1 << __XFS_IFLOCK_BIT)
>  #define __XFS_IPINNED_BIT	8	 /* wakeup key for zero pin count */
>  #define XFS_IPINNED		(1 << __XFS_IPINNED_BIT)
> +#define XFS_IDONTCACHE		(1 << 9) /* don't cache the inode long term */
>
>  /*
>   * Per-lifetime flags need to be reset when re-using a reclaimable inode during
> - * inode lookup. Thi prevents unintended behaviour on the new inode from
> + * inode lookup. This prevents unintended behaviour on the new inode from
>   * ocurring.
>   */
>  #define XFS_IRECLAIM_RESET_FLAGS	\
> @@ -544,6 +545,7 @@ do {					\
>   */
>  #define XFS_IGET_CREATE		0x1
>  #define XFS_IGET_UNTRUSTED	0x2
> +#define XFS_IGET_DONTCACHE	0x4
>
>  int xfs_inotobp(struct xfs_mount *, struct xfs_trans *,
>  		xfs_ino_t, struct xfs_dinode **,
> diff --git a/fs/xfs/xfs_itable.c b/fs/xfs/xfs_itable.c
> index 751e94f..b832c58 100644
> --- a/fs/xfs/xfs_itable.c
> +++ b/fs/xfs/xfs_itable.c
> @@ -76,7 +76,8 @@ xfs_bulkstat_one_int(
>  		return XFS_ERROR(ENOMEM);
>
>  	error = xfs_iget(mp, NULL, ino,
> -			 XFS_IGET_UNTRUSTED, XFS_ILOCK_SHARED, &ip);
> +			 (XFS_IGET_DONTCACHE | XFS_IGET_UNTRUSTED),
> +			 XFS_ILOCK_SHARED, &ip);
>  	if (error) {
>  		*stat = BULKSTAT_RV_NOTHING;
>  		goto out_free;
> diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
> index b1df512..c162765 100644
> --- a/fs/xfs/xfs_super.c
> +++ b/fs/xfs/xfs_super.c
> @@ -953,6 +953,22 @@ xfs_fs_evict_inode(
>  	xfs_inactive(ip);
>  }
>
> +/*
> + * We do an unlocked check for XFS_IDONTCACHE here because we are already
> + * serialised against cache hits here via the inode->i_lock and igrab() in
> + * xfs_iget_cache_hit(). Hence a lookup that might clear this flag will not be
> + * racing with us, and it avoids needing to grab a spinlock here for every inode
> + * we drop the final reference on.
> + */

I'll try to put this in my own words, just in case it is mystifying for
anyone else. ;)

In this case it is ok to check ip->i_flags without holding
ip->i_flags_lock because... we have exclusion from xfs_iget_cache_hit as
follows:

The 'dropper' would have taken inode->i_lock when the inode's count went
to zero, and if the XFS_IDONTCACHE flag is set, dropper will return 1 to
iput_final, which will result in iput_final skipping the inode lru and
setting I_FREEING immediately, before dropping inode->i_lock and evicting
the inode.

A 'cache hitter' must call igrab in order to get a reference on the inode.
igrab takes the inode->i_lock, and if I_FREEING is set, it returns NULL;
xfs_iget_cache_hit then returns EAGAIN and the lookup is restarted.

So... any 'cache hitter' who could possibly clear the XFS_IDONTCACHE flag
subsequent to 'dropper' checking it would always be unable to get a
reference due to I_FREEING having been set by the dropper.

I appreciate that you added the comment.

Regards,
Ben

> +STATIC int
> +xfs_fs_drop_inode(
> +	struct inode		*inode)
> +{
> +	struct xfs_inode	*ip = XFS_I(inode);
> +
> +	return generic_drop_inode(inode) || (ip->i_flags & XFS_IDONTCACHE);
> +}
> +
>  STATIC void
>  xfs_free_fsname(
>  	struct xfs_mount	*mp)
> @@ -1431,6 +1447,7 @@ static const struct super_operations xfs_super_operations = {
>  	.dirty_inode		= xfs_fs_dirty_inode,
>  	.write_inode		= xfs_fs_write_inode,
>  	.evict_inode		= xfs_fs_evict_inode,
> +	.drop_inode		= xfs_fs_drop_inode,
>  	.put_super		= xfs_fs_put_super,
>  	.sync_fs		= xfs_fs_sync_fs,
>  	.freeze_fs		= xfs_fs_freeze,
> --
> 1.7.9
>
> _______________________________________________
> xfs mailing list
> xfs@xxxxxxxxxxx
> http://oss.sgi.com/mailman/listinfo/xfs
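
For anyone who would rather see the dropper / cache hitter argument as
code, here is a rough single-threaded userspace sketch of the flag
lifecycle. It is only a model, not kernel code: every model_* and MODEL_*
name is made up for illustration, the i_lock serialisation is assumed
rather than implemented, and the other conditions generic_drop_inode()
checks are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the flags in the patch; illustration only. */
#define MODEL_IGET_DONTCACHE	0x4		/* lookup flag, as passed by bulkstat */
#define MODEL_IDONTCACHE	(1 << 9)	/* per-inode "don't cache" flag */
#define MODEL_I_FREEING		(1 << 0)	/* models I_FREEING in inode->i_state */

struct model_inode {
	int		refcount;	/* models inode->i_count */
	unsigned int	vfs_state;	/* models inode->i_state */
	unsigned int	xfs_flags;	/* models ip->i_flags */
	bool		on_lru;		/* true once placed on the VFS LRU */
};

/* Cache miss: a new inode is set up; the flag is set only if requested. */
static void model_cache_miss(struct model_inode *ip, unsigned int iget_flags)
{
	ip->refcount = 1;
	ip->vfs_state = 0;
	ip->on_lru = false;
	ip->xfs_flags = (iget_flags & MODEL_IGET_DONTCACHE) ?
				MODEL_IDONTCACHE : 0;
}

/*
 * Cache hit: models igrab() followed by clearing the flag. Returns false
 * for the case where igrab() fails and the lookup has to be retried.
 */
static bool model_cache_hit(struct model_inode *ip)
{
	if (ip->vfs_state & MODEL_I_FREEING)
		return false;		/* dropper got there first */
	ip->refcount++;
	ip->xfs_flags &= ~MODEL_IDONTCACHE;
	return true;
}

/* Reference drop: on the final put, model the .drop_inode decision. */
static void model_put(struct model_inode *ip)
{
	assert(ip->refcount > 0);
	if (--ip->refcount > 0)
		return;
	if (ip->xfs_flags & MODEL_IDONTCACHE)
		ip->vfs_state |= MODEL_I_FREEING;	/* skip the LRU, evict */
	else
		ip->on_lru = true;			/* cache as usual */
}
```

In this model, a miss with MODEL_IGET_DONTCACHE followed by a single
model_put() leaves the inode marked as freeing and off the LRU, and a later
model_cache_hit() fails. If a hit takes a second reference before the last
put, the flag is cleared and the inode ends up on the LRU instead.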