Re: [PATCH 2/2] xfs: add 'discard_sync' mount flag

Jens Axboe <axboe@xxxxxxxxx> · Mon, 30 Apr 2018 15:42:11 -0600

On 4/30/18 3:31 PM, Dave Chinner wrote:
> On Mon, Apr 30, 2018 at 09:32:52AM -0600, Jens Axboe wrote:
>> XFS recently added support for async discards. While this can be
>> a win for some workloads and devices, there are also cases where
>> async bursty discard will severly harm the latencies of reads
>> and writes.
> 
> FWIW, convention is it document the performance regression in the
> commit message, not leave the reader to guess at what it was....

Yeah I'll give you that, I can improve the commit message for sure.

> Did anyone analyse the pattern of discards being issued to work out
> what pattern was worse for async vs sync discard? is it lots of
> little discards, large extents being discarded, perhaps a problem
> with the request request queue starving other IOs because we queue
> so many async discards in such a short time (which is the difference
> in behaviour vs the old code), or something else?

What was observed was a big discard which would previously have
gone down as smaller discards now going down as either one or many
discards. Looking at the blktrace data, it's the difference between

discard 1 queue
discard 1 complete
discatd 2 queue
discard 2 complete
[...]
discard n queue
discard n complete

which is now

discard 1 queue
discard 2 queue
[...]
discard n queue
[...]
discard 1 complete
discard 2 complete
[...]
discard n complete

Note that we set a max discard size of 64MB for most devices,
since it's been shown to have less impact on latencies for
the IO that jobs actually care about.

>> Add a 'discard_sync' mount flag to revert to using sync discard,
>> issuing them one at the time and waiting for each one. This fixes
>> a big performance regression we had moving to kernels that include
>> the XFS async discard support.
> 
> I'm not a fan of adding a mount option to work around bad,
> unpredictable performance due to a mount option we recommend you
> don't use because it results in bad, unpredictable performance.

Oh I hear you, as I wrote in other replies, I don't generally
recommend discard except for cases where it's been proven to be
useful in terms of write amplification improvements. If we can
avoid using it, we do. It's a tradeoff, and for some situations,
the right decision is to use discards.

> Without any details of the discard pattern that results in problems
> I don't think we should be changing anything - adding an opaque,
> user-unfriendly mount option does nothing to address the underlying
> problem - it's just a hack to work around the symptoms being seen...
> 
> More details of the regression and the root cause analysis is
> needed, please.

It brings back the same behavior as we had before, which performs
better for us. It's preventing users of XFS+discard from upgrading,
which is sad.

-- 
Jens Axboe

--
To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html