Re: [PATCH] ext4: hand over jobs handling discard commands on commit complete phase to kworkers

Jan Kara <jack@xxxxxxx> · Tue, 16 May 2017 17:11:53 +0200

On Tue 16-05-17 16:37:42, Daeho Jeong wrote:
> Now, when we mount an ext4 filesystem with the '-o discard' option, we
> have to issue all the discard commands for the blocks to be deallocated
> and wait for the completion of the commands in the commit complete phase.
> Because this procedure might involve many sequential rounds of issuing
> a discard command and waiting for it, the delay of this procedure might
> be far too long, up to half a minute in our test, and it results in
> long commit delays and fsync() performance degradation.
> 
> If we process these jobs handling discard commands as work items and
> delegate the jobs to kworkers, the journaling thread doesn't need to
> wait for the discard command handling anymore, and the discard commands
> that were handled sequentially will be handled in parallel by several
> kworkers, so discard command handling performance will also improve.
> By doing this, the half a minute delay of a single commit in the worst
> case has been reduced to 255ms in our test.
> 
> Signed-off-by: Daeho Jeong <daeho.jeong@xxxxxxxxxxx>
> Tested-by: Hobin Woo <hobin.woo@xxxxxxxxxxx>
> Tested-by: Kitae Lee <kitae87.lee@xxxxxxxxxxx>

So I see several problems with this. Firstly, it breaks the ENOSPC handling
logic which relies on the fact that after forcing a transaction commit all
blocks held by the transaction are released - now they will be released
only after the work is completed and thus we can prematurely report ENOSPC.
Secondly, offloading the discard work doesn't change the fundamental fact
that discard is slow (for some devices) and this change just hides this
fact at the possible cost of, for example, higher file fragmentation as a
result of delayed block freeing. Also, the outstanding queue of discard
requests isn't limited in any way, again leading to possible strange
allocation / ENOSPC issues.
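
To illustrate the first problem, here is a toy userspace model, not ext4
code. Every name in it (ToyFs, force_commit, run_discard_work, and so on)
is hypothetical, and it assumes only what is described above: an allocation
that runs out of space forces a commit and retries, expecting the blocks
freed by that transaction to be usable afterwards.

```python
import errno


class ToyFs:
    """Toy free-space accounting; a sketch, not the ext4 allocator."""

    def __init__(self, free_blocks, defer_discard):
        self.free = free_blocks          # blocks that can be allocated now
        self.in_transaction = 0          # freed, but transaction not committed
        self.pending_discard = 0         # committed, discard work still queued
        self.defer_discard = defer_discard

    def free_blocks(self, n):
        # Freed blocks stay tied to the running transaction until commit.
        self.in_transaction += n

    def force_commit(self):
        n, self.in_transaction = self.in_transaction, 0
        if self.defer_discard:
            # Patched behaviour: commit returns, blocks wait for the worker.
            self.pending_discard += n
        else:
            # Current behaviour: discard is issued and waited for in commit.
            self.free += n

    def run_discard_work(self):
        # Stands in for the deferred work item finishing some time later.
        self.free += self.pending_discard
        self.pending_discard = 0

    def allocate(self, n):
        if self.free < n:
            self.force_commit()          # retry path: commit, then look again
        if self.free < n:
            return -errno.ENOSPC
        self.free -= n
        return 0
```

With defer_discard=False, freeing 10 blocks and then allocating 5 succeeds
after the forced commit. With defer_discard=True the same sequence returns
-ENOSPC until run_discard_work() has run, even though the space is
logically free.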

So I agree with Christoph that you should rather submit discards directly
from the jbd2 thread as we do now, just submit all of them and then wait
for completion to allow parallel processing in the device. And if the
device doesn't support fast enough parallel processing of discards then the
'discard' mount option isn't really suited for such a device and no amount
of offloading is going to fix that fact.
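
As a rough sketch of the difference, the following toy model (userspace
Python, not kernel code) compares waiting after every discard with
submitting all of them and waiting once. The function names, the
per-command latencies, and the queue_depth parameter are assumptions made
up for illustration; queue_depth stands for how many discards the device
can process at the same time.

```python
import heapq


def serial_commit_delay_ms(latencies_ms):
    """Issue one discard, wait for it, then issue the next."""
    return sum(latencies_ms)


def batched_commit_delay_ms(latencies_ms, queue_depth):
    """Submit every discard up front, then wait for all of them once.

    The device is modelled as queue_depth slots; each command starts as
    soon as the earliest slot becomes free, in submission order.
    """
    if queue_depth < 1:
        raise ValueError("queue_depth must be at least 1")
    slots = [0] * queue_depth        # time at which each slot becomes free
    heapq.heapify(slots)
    finish = 0
    for latency in latencies_ms:
        start = heapq.heappop(slots)
        end = start + latency
        heapq.heappush(slots, end)
        finish = max(finish, end)
    return finish
```

For eight discards of 10 ms each, the serial variant takes 80 ms while the
batched variant takes 20 ms at queue_depth=4. At queue_depth=1 both take
80 ms, which is the case where the device itself is the bottleneck and
moving the work to another thread does not make it any faster.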

								Honza
-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR