On Fri, Apr 22, 2022 at 12:22:28PM -0400, Brian Foster wrote: > On Fri, Apr 22, 2022 at 09:00:21AM -0700, Darrick J. Wong wrote: > > On Fri, Apr 22, 2022 at 10:12:26AM -0400, Brian Foster wrote: > > > The filestream AG selection loop uses pagf data to aid in AG > > > selection, which depends on pagf initialization. If the in-core > > > structure is not initialized, the caller invokes the AGF read path > > > to do so and carries on. If another task enters the loop and finds > > > a pagf init already in progress, the AGF read returns -EAGAIN and > > > the task continues the loop. This does not increment the current ag > > > index, however, which means the task spins on the current AGF buffer > > > until unlocked. > > > > > > If the AGF read I/O submitted by the initial task happens to be > > > delayed for whatever reason, this results in soft lockup warnings > > > > Is there a specific 'whatever reason' going on here? > > > > Presumably.. given this seems to reproduce reliably or not at all in > certain environments/configs, my suspicion was that either the timing of > the test changes enough such that some other task involved with the test > is able to load the bdev, or otherwise timing changes just enough to > trigger the pagf_init race and the subsequent spinning is what > exacerbates the delay (i.e. burning cpu and subsequent soft lockup BUG > starve out some part(s) of the I/O submission/completion processing). > I've no tangible evidence for either aside from the latter seems fairly > logical when you consider that the test consistently completes in 3-4 > seconds with the fix in place, but without it we consistently hit > multiple instances of the soft lockup detector (on ~20s intervals IIRC) > and the system seems to melt down indefinitely. *shrug* Ah, ok, so there wasn't any specific event that was causing AGF IO to take a long time, it's just that a thread running the filestream allocator could fail the trylock loop for any reason for long enough to trip the hangcheck warning. --D > Brian > > > > via the spinning task. This is reproduced by xfs/170. To avoid this > > > problem, fix the AGF trylock failure path to properly iterate to the > > > next AG. If a task iterates all AGs without making progress, the > > > trylock behavior is dropped in favor of blocking locks and thus a > > > soft lockup is no longer possible. > > > > > > Fixes: f48e2df8a877ca1c ("xfs: make xfs_*read_agf return EAGAIN to ALLOC_FLAG_TRYLOCK callers") > > > > Ooops, this was a major braino on my part. > > Reviewed-by: Darrick J. Wong <djwong@xxxxxxxxxx> > > > > --D > > > > > Signed-off-by: Brian Foster <bfoster@xxxxxxxxxx> > > > --- > > > > > > I included the Fixes: tag because this looks like a regression in said > > > commit, but I've not explicitly verified. > > > > > > Brian > > > > > > fs/xfs/xfs_filestream.c | 7 ++++--- > > > 1 file changed, 4 insertions(+), 3 deletions(-) > > > > > > diff --git a/fs/xfs/xfs_filestream.c b/fs/xfs/xfs_filestream.c > > > index 6a3ce0f6dc9e..be9bcf8a1f99 100644 > > > --- a/fs/xfs/xfs_filestream.c > > > +++ b/fs/xfs/xfs_filestream.c > > > @@ -128,11 +128,12 @@ xfs_filestream_pick_ag( > > > if (!pag->pagf_init) { > > > err = xfs_alloc_pagf_init(mp, NULL, ag, trylock); > > > if (err) { > > > - xfs_perag_put(pag); > > > - if (err != -EAGAIN) > > > + if (err != -EAGAIN) { > > > + xfs_perag_put(pag); > > > return err; > > > + } > > > /* Couldn't lock the AGF, skip this AG. */ > > > - continue; > > > + goto next_ag; > > > } > > > } > > > > > > -- > > > 2.34.1 > > > > > >