On Tue, Apr 08, 2014 at 08:36:19AM -0500, Eric Sandeen wrote: > On 4/8/14, 7:58 AM, Brian Foster wrote: > > On Mon, Apr 07, 2014 at 10:27:28PM -0500, Eric Sandeen wrote: > >> This fixes a regression caused by: > >> > >> 97b1fcf xfs_repair: fix array overrun in do_inode_prefetch > >> > >> The thread creation loop has 2 ways to exit; either via > >> the loop counter based on thread_count, or the break statement > >> if we've started enough workers to cover all AGs. > >> > >> Whether or not the loop counter "i" reflects the number of > >> threads started depends on whether or not we exited via the > >> break. > >> > >> The above commit prevented us from indexing off the end > >> of the queues[] array if we actually advanced "i" all the > >> way to thread_count, but in the case where we break, "i" > >> is one *less* than the nr of threads started, so we don't > >> wait for completion of all threads, and all hell breaks > >> loose in phase 5. > >> > >> Just stop with the cleverness of re-using the loop counter - > >> instead, explicitly count threads that we start, and then use > >> that counter to wait for each worker to complete. > >> > >> Signed-off-by: Eric Sandeen <sandeen@xxxxxxxxxx> > >> --- > >> > >> I have one fs which demonstrates the problem, and have verified > >> the regression & tested the fix against that. > >> > >> I'll run this over xfstests overnight, but it seems obvious > >> from here (OTOH the other fix seemed obvious too) :( > >> > >> diff --git a/repair/prefetch.c b/repair/prefetch.c > >> index e47a48e..4c32395 100644 > >> --- a/repair/prefetch.c > >> +++ b/repair/prefetch.c > >> @@ -944,6 +944,7 @@ do_inode_prefetch( > >> int i; > >> struct work_queue queue; > >> struct work_queue *queues; > >> + int queues_started = 0; > >> > >> /* > >> * If the previous phases of repair have not overflowed the buffer > >> @@ -987,6 +988,7 @@ do_inode_prefetch( > >> > >> create_work_queue(&queues[i], mp, 1); > >> queue_work(&queues[i], prefetch_ag_range_work, 0, wargs); > >> + queues_started++; > >> > >> if (wargs->end_ag >= mp->m_sb.sb_agcount) > >> break; > >> @@ -995,7 +997,7 @@ do_inode_prefetch( > >> /* > >> * wait for workers to complete > >> */ > >> - while (i--) > >> + for (i = 0; i < queues_started; i++) > >> destroy_work_queue(&queues[i]); > > > > Fix looks good, but any reason to reverse the order of the destroy loop? > > simplicity? :) > Fine by me. :) Reviewed-by: Brian Foster <bfoster@xxxxxxxxxx> > I don't think it matters operationally... > > -Eric > > > _______________________________________________ > xfs mailing list > xfs@xxxxxxxxxxx > http://oss.sgi.com/mailman/listinfo/xfs _______________________________________________ xfs mailing list xfs@xxxxxxxxxxx http://oss.sgi.com/mailman/listinfo/xfs