On Tue, Nov 06, 2018 at 09:23:11AM -0500, Brian Foster wrote:
> On Mon, Nov 05, 2018 at 01:51:39PM -0800, Christoph Hellwig wrote:
> > On Mon, Nov 05, 2018 at 01:10:21PM -0500, Brian Foster wrote:
> > > When online discard is enabled, discards of busy extents are
> > > submitted asynchronously as a bio chain. bio completion and
> > > resulting busy extent cleanup is deferred to a workqueue. Async
> > > discard submission is intended to avoid blocking log forces on a
> > > full discard sequence which can take a noticeable amount of time in
> > > some cases.
> > >
> > > We've had reports of this still producing log force stalls with XFS
> > > on VDO,
> >
> > Please fix this in VDO instead. We should not work around out of
> > tree code making stupid decisions.
>
> I assume the "stupid decision" refers to sync discard execution. I'm not
> familiar with the internals of VDO, this is just what I was told.

IMO, what VDO does is irrelevant - any call to submit_bio() can block
if the request queue is full. Hence if we've drowned the queue in
discards and the device is slow at discards, then we are going to
block submitting discards.

> My
> understanding is that these discards can stack up and take enough time
> that a limit on outstanding discards is required, which now that I think
> of it makes me somewhat skeptical of the whole serial execution thing.
> Hitting that outstanding discard request limit is what bubbles up the
> stack and affects XFS by holding up log forces, since new discard
> submissions are presumably blocked on completion of the oldest
> outstanding request.

Exactly.

> I'm not quite sure what happens in the block layer if that limit were
> lifted. Perhaps it assumes throttling responsibility directly via
> queues/plugs? I'd guess that at minimum we'd end up blocking indirectly
> somewhere (via memory allocation pressure?) anyways, so ISTM that some
> kind of throttling is inevitable in this situation. What am I missing?
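
To make the "submission blocks once the queue is full" point concrete,
here is a rough userspace toy model. It is not kernel code: the
function name, the fixed queue depth and the fixed per-discard service
time are all assumptions made up for this sketch. It only tracks how
long a submitter stalls when a slow device has a bounded queue.

```python
from collections import deque


def submitter_stall_time(num_discards, queue_depth, service_time):
    """Toy model (hypothetical, not the block layer's real behaviour).

    A device services one discard every `service_time` ticks in FIFO
    order and accepts at most `queue_depth` outstanding requests.
    Queuing a request is free; the submitter only stalls when the queue
    is full. Returns the virtual time at which the last discard has
    been queued, i.e. how long the submitter was held up.
    """
    now = 0
    in_flight = deque()   # completion times of outstanding requests
    last_completion = 0   # when the device finishes its current backlog

    for _ in range(num_discards):
        # Retire requests the device has already finished.
        while in_flight and in_flight[0] <= now:
            in_flight.popleft()

        if len(in_flight) == queue_depth:
            # Queue full: wait for the oldest outstanding request.
            now = in_flight.popleft()

        # The device starts this request once it is idle.
        start = max(now, last_completion)
        last_completion = start + service_time
        in_flight.append(last_completion)

    return now


if __name__ == "__main__":
    # Queue never fills: the submitter is not held up at all.
    print(submitter_stall_time(4, queue_depth=8, service_time=10))
    # Queue fills after two requests: every further submission waits.
    print(submitter_stall_time(8, queue_depth=2, service_time=10))
```

In this model the stall grows linearly with the number of discards
beyond the queue depth, which is the shape of the problem regardless of
which device sits underneath.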
We still need to throttle discards - they have to play nice with all
the other IO we need to dispatch concurrently. I have two issues with
the proposed patch:

1. it puts both discard dispatch and completion processing on the
one workqueue, so if the queue is filled with dispatch requests, IO
completion queuing gets blocked. That's not the best thing to be
doing.

2. log forces no longer wait for discards to be dispatched - they
just queue them. This means the mechanism xfs_extent_busy_flush()
uses to dispatch pending discards (synchronous log force) can return
before discards have even been dispatched to disk. Hence we can
expect to see longer wait and tail latencies when busy extents are
encountered by the allocator. Whether this is a problem or not needs
further investigation.

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
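
Addendum, illustrating issue 1 above: a minimal sketch of why sharing
one queue between dispatch and completion work delays completions. It
assumes a single worker servicing each queue strictly in FIFO order,
which is a simplification of how real kernel workqueues behave. The
function name and the cost numbers are hypothetical.

```python
def completion_finish_times(work, shared_queue):
    """Toy model of FIFO work queues with one worker per queue.

    `work` is a list of (kind, cost) tuples in queuing order, where
    kind is "dispatch" or "completion" and cost is in ticks. All items
    are assumed to be queued at time 0. Returns the time at which each
    completion item finishes.
    """
    clocks = {"dispatch": 0, "completion": 0}
    finished = []
    for kind, cost in work:
        # With a shared queue, everything runs behind the same worker.
        queue = "dispatch" if shared_queue else kind
        clocks[queue] += cost
        if kind == "completion":
            finished.append(clocks[queue])
    return finished


if __name__ == "__main__":
    # Two slow (blocking) dispatches queued ahead of a cheap completion.
    work = [("dispatch", 50), ("dispatch", 50), ("completion", 1)]
    print(completion_finish_times(work, shared_queue=True))
    print(completion_finish_times(work, shared_queue=False))
```

With the shared queue the cheap completion item waits behind both
dispatch items; with separate queues it runs immediately.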