On Wed, Nov 06, 2019 at 12:21:04PM -0500, Brian Foster wrote: > On Fri, Nov 01, 2019 at 10:46:14AM +1100, Dave Chinner wrote: > > - struct xfs_inode *ip, > > - int flags, > > - xfs_lsn_t *lsn) > > +enum lru_status > > +xfs_inode_reclaim_isolate( > > + struct list_head *item, > > + struct list_lru_one *lru, > > + spinlock_t *lru_lock, > > Did we ever establish whether we should cycle the lru_lock during long > running scans? I'm still evaluating this. In theory, because it's non-blocking, the lock hold time isn't huge, but OTOH I think the hold time is causing lock contention problems on unlink workloads. I've found a bunch of perf/blocking problems in the last few days, and each one of them I sort out puts more pressure on the lru list lock on unlinks. > > - /* > > - * Do unlocked checks to see if the inode already is being flushed or in > > - * reclaim to avoid lock traffic. If the inode is not clean, return the > > - * position in the AIL for the caller to push to. > > - */ > > - if (!xfs_inode_clean(ip)) { > > - *lsn = ip->i_itemp->ili_item.li_lsn; > > - return false; > > + if (!__xfs_iflock_nowait(ip)) { > > + lsn = ip->i_itemp->ili_item.li_lsn; > > This looks like a potential crash vector if we ever got here with a > clean inode. I'm not sure we can ever fail a flush lock attempt on a clean inode. But I'll rework the lsn grabbing, I think. > > + ra->dirty_skipped++; > > + goto out_unlock_inode; > > } > > > > - if (__xfs_iflags_test(ip, XFS_IFLOCK | XFS_IRECLAIM)) > > - return false; > > + if (XFS_FORCED_SHUTDOWN(ip->i_mount)) > > + goto reclaim; > > > > /* > > - * The radix tree lock here protects a thread in xfs_iget from racing > > - * with us starting reclaim on the inode. Once we have the > > - * XFS_IRECLAIM flag set it will not touch us. > > - * > > - * Due to RCU lookup, we may find inodes that have been freed and only > > - * have XFS_IRECLAIM set. Indeed, we may see reallocated inodes that > > - * aren't candidates for reclaim at all, so we must check the > > - * XFS_IRECLAIMABLE is set first before proceeding to reclaim. > > + * Now the inode is locked, we can actually determine if it is dirty > > + * without racing with anything. > > */ > > - spin_lock(&ip->i_flags_lock); > > - if (!__xfs_iflags_test(ip, XFS_IRECLAIMABLE) || > > - __xfs_iflags_test(ip, XFS_IRECLAIM)) { > > - /* not a reclaim candidate. */ > > - spin_unlock(&ip->i_flags_lock); > > - return false; > > + ret = LRU_ROTATE; > > + if (xfs_ipincount(ip)) { > > + ra->dirty_skipped++; > > Hmm.. didn't we have an LSN check here? Yes, but if the inode was not in the AIL, it would crash, so I removed it :P > Altogether, I think the logic in this function would be a lot more > simple if we had something like the following: > > ... > /* ret == LRU_SKIP */ > if (!xfs_inode_clean(ip)) { > ret = LRU_ROTATE; > lsn = ip->i_itemp->ili_item.li_lsn; > ra->dirty_skipped++; > } > if (lsn && XFS_LSN_CMP(lsn, ra->lowest_lsn) < 0) > ra->lowest_lsn = lsn; > return ret; > > ... as the non-reclaim exit path. Yeah, that was what I was thinking when you pointed out the iflock_nowait issue above. I'll end up with something like this, I think.... > > void > > xfs_reclaim_all_inodes( > > struct xfs_mount *mp) > > { > ... > > + while (list_lru_count(&mp->m_inode_lru)) { > > It seems unnecessary to call this twice per-iter: > > while ((to_free = list_lru_count(&mp->m_inode_lru))) { > ... > } *nod*. Cheers, Dave. -- Dave Chinner david@xxxxxxxxxxxxx