On Fri, Oct 11, 2019 at 11:29:45AM -0400, Brian Foster wrote:
> On Fri, Oct 11, 2019 at 03:18:25AM -0700, Christoph Hellwig wrote:
> > On Wed, Oct 09, 2019 at 02:21:14PM +1100, Dave Chinner wrote:
> > > Factor the common AIL deletion code that does all the wakeups into a
> > > helper so we only have one copy of this somewhat tricky code to
> > > interface with all the wakeups necessary when the LSN of the log
> > > tail changes.
> > > 
> > > xfs_ail_push_sync() is temporary infrastructure to facilitate
> > > non-blocking, IO-less inode reclaim throttling that allows further
> > > structural changes to be made. Once those structural changes are
> > > made, the need for this function goes away and it is removed,
> > > leaving us with only the xfs_ail_update_finish() factoring when this
> > > is all done.
> > 
> > The xfs_ail_update_finish work here is in an earlier patch, so the
> > changelog will need some updates.
> > 
> > > +	spin_lock(&ailp->ail_lock);
> > > +	while ((lip = xfs_ail_min(ailp)) != NULL) {
> > > +		prepare_to_wait(&ailp->ail_push, &wait, TASK_UNINTERRUPTIBLE);
> > > +		if (XFS_FORCED_SHUTDOWN(ailp->ail_mount) ||
> > > +		    XFS_LSN_CMP(threshold_lsn, lip->li_lsn) <= 0)
> 
> Wasn't this supposed to change to < 0? The rfc series had that logic,
> but it changed from <= to < later in the wrong patch.

I probably forgot because this code gets removed at the end of the
series. Hence I haven't cared about exact correctness or neatness, as
it's just temporary scaffolding to keep stuff from breaking horribly
while the changeover to non-blocking algorithms is done.

It works well enough that I can't break it as it stands - I've tested
each patch individually with both load and fstests, so this code
doesn't introduce any bisect landmines - and it prevents a bunch of
problems in OOM conditions by retaining the blocking behaviour of
reclaim until we no longer need it...

> > > +			break;
> > > +		/* XXX: cmpxchg? */
> > > +		while (XFS_LSN_CMP(threshold_lsn, ailp->ail_target) > 0)
> > > +			xfs_trans_ail_copy_lsn(ailp, &ailp->ail_target, &threshold_lsn);
> > 
> > This code looks broken on 32-bit given that xfs_trans_ail_copy_lsn takes
> > the ail_lock there.  Just replacing the xfs_trans_ail_copy_lsn call with
> > a direct assignment would fix that, no need for cmpxchg either as far
> > as I can tell (and it would fix that too long line as well).

Oh, right. I'll fix that.

> > still looks odd, I think this should simply be an if.
> > 
> > > +		wake_up_process(ailp->ail_task);
> > > +		spin_unlock(&ailp->ail_lock);
> > 
> > xfsaild will take ail_lock pretty quickly. I think we should drop
> > the lock before waking it.
> 
> Can't we replace this whole thing with something that repurposes
> xfs_ail_push_all_sync()? That only requires some tweaks to the existing
> function and the new _push_all_sync() wrapper ends up looking something
> like:
> 
> 	while ((threshold_lsn = xfs_ail_max_lsn(ailp)) != 0)
> 		xfs_ail_push_sync(ailp, threshold_lsn);
> 
> There's an extra lock cycle, but that's still only on tail updates. That
> doesn't seem unreasonable to me for the usage of _push_all_sync().

The whole thing goes away, so there is zero point in trying to
optimise or perfect this code. It's temporary code, treat it as such.

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
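
[For reference, a sketch of how the xfs_ail_push_sync() wait loop might
look with the review feedback above folded in: Brian's "< 0" comparison,
Christoph's direct ail_target assignment under ail_lock in place of
xfs_trans_ail_copy_lsn() (whose internal locking would self-deadlock on
32-bit here), the inner while collapsed to an if, and ail_lock dropped
before waking xfsaild. The schedule()/finish_wait() framing is an
assumption based on the standard prepare_to_wait() pattern; it is not
part of the hunks quoted in this thread.]

	spin_lock(&ailp->ail_lock);
	while ((lip = xfs_ail_min(ailp)) != NULL) {
		prepare_to_wait(&ailp->ail_push, &wait, TASK_UNINTERRUPTIBLE);
		if (XFS_FORCED_SHUTDOWN(ailp->ail_mount) ||
		    XFS_LSN_CMP(threshold_lsn, lip->li_lsn) < 0)
			break;
		/* Direct assignment is safe: we hold ail_lock here. */
		if (XFS_LSN_CMP(threshold_lsn, ailp->ail_target) > 0)
			ailp->ail_target = threshold_lsn;
		/* Drop the lock before the wakeup so xfsaild can take it. */
		spin_unlock(&ailp->ail_lock);
		wake_up_process(ailp->ail_task);
		schedule();
		spin_lock(&ailp->ail_lock);
	}
	spin_unlock(&ailp->ail_lock);
	finish_wait(&ailp->ail_push, &wait);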