On Mon, Oct 26, 2020 at 03:40:51PM -0700, Darrick J. Wong wrote: > On Tue, Oct 27, 2020 at 09:29:43AM +1100, Dave Chinner wrote: > > On Sat, Oct 24, 2020 at 09:41:14PM -0700, Darrick J. Wong wrote: > > > On Thu, Oct 22, 2020 at 04:15:31PM +1100, Dave Chinner wrote: > > > > @@ -140,6 +164,7 @@ workqueue_add( > > > > > > > > /* Now queue the new work structure to the work queue. */ > > > > pthread_mutex_lock(&wq->lock); > > > > +restart: > > > > if (wq->next_item == NULL) { > > > > assert(wq->item_count == 0); > > > > ret = -pthread_cond_signal(&wq->wakeup); > > > > @@ -150,6 +175,16 @@ workqueue_add( > > > > } > > > > wq->next_item = wi; > > > > } else { > > > > + /* throttle on a full queue if configured */ > > > > + if (wq->max_queued && wq->item_count == wq->max_queued) { > > > > + pthread_cond_wait(&wq->queue_full, &wq->lock); > > > > > > I ported xfs_scrub to use max_queued for the inode scanner, and got a > > > hang here. It uses two workqueues -- the first is an unbouned workqueue > > > that receives one work item per AG in which each work item calls > > > INUMBERS, creates a work item for the returned inode chunk, and throws > > > it at the second workqueue. The second workqueue is a bounded workqueue > > > that calls BULKSTAT on the INUMBERS work item and then calls the > > > iteration function on each bulkstat record returned. > > > > > > The hang happens when the inumbers workqueue has more than one thread > > > running. > > > > IIUC, that means you have multiple producer threads? IIRC, he usage > > in this patchset is single producer, so it won't hit this problem... > > Right, there are multiple producers, because that seemed like fun. > At this stage, I'm merely using this patch in anger (as it were) to > prototype a fix to scrub. > > I'm not even sure it's a reasonable enhancement for xfs_scrub since each > of those bulkstat workers will then fight with the inumbers workers for > the AGI, but so it goes. It does seem to eliminate the problem of one > thread working hard on an AG that has one huge fragmented file while the > other threads sit idle, but otherwise I mostly just see more buffer lock > contention. Yeah, that's exactly the problem it solves for phase6, too. i.e. it stops a single huge directory from preventing other directories in that AG from being processed even when everything else is done. > The total runtime doesn't change on more balanced > filesystems, at least. :P That's a start :) > > > I dunno if that's a sane way to structure an inumbers/bulkstat scan, but > > > it seemed reasonable to me. I can envision two possible fixes here: (1) > > > use pthread_cond_broadcast to wake everything up; or (2) always call > > > pthread_cond_wait when we pull a work item off the queue. Thoughts? > > > > pthread_cond_broadcast() makes more sense, but I suspect there will > > be other issues with multiple producers that render the throttling > > ineffective. I suspect supporting multiple producers should be a > > separate patchset... > > <nod> Making change (2) seems to work for multiple producers, but I > guess I'll keep poking at perf to see if I discover anything exciting. I'll look at the multiple producer thing in a bit more detail later today... Cheers, Dave. -- Dave Chinner david@xxxxxxxxxxxxx