Re: trim performance on xfs (also reply to: Online TRIM/discard performance impact)

Dave Chinner <david@xxxxxxxxxxxxx> · Thu, 15 Dec 2011 16:49:34 +1100

On Thu, Dec 15, 2011 at 04:14:28AM +0200, Andrei Purdea wrote:
> I also recently met with an issue that has been described in a
> previous email (http://oss.sgi.com/archives/xfs/2011-11/msg00108.html).
> I have an OCZ Agility 3 (also a SandForce 2281 controller), and a
> Corsair Performance Pro (non-SandForce controller, with good
> non-compressible data speeds).
> Removing a linux-3.2-rc5 tree took more than 6 minutes with online
> discard, and ext4 was very speedy (less than a second), but only on the
> SandForce controller!! For the Corsair, performance was good on both
> filesystems.
> So I decided to debug what is happening. I also noticed that fstrim was
> slower on the OCZ (30-40 seconds, vs. 3 seconds on the Corsair). So I had
> a hunch that the performance difference comes from the number
> of sectors that are discarded in a single TRIM command.
> So I compiled a kernel where I inserted a printk into
> blkdev_issue_discard(), to see how many sectors each call asks for.

XFS has a tracepoint that emits an event for every discard it
issues, so you can look at this without needing to add printk()s to
the code....
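
As a rough sketch of what that looks like in practice: assuming the
event is exposed as xfs/xfs_discard_extent under the usual tracefs mount
at /sys/kernel/debug/tracing (both worth checking on your kernel), a few
lines of Python are enough to count discards from the trace output. The
helper names below are made up for illustration.

```python
import re

# Assumed location and event name; verify both on the running kernel.
TRACING = "/sys/kernel/debug/tracing"
EVENT = "xfs_discard_extent"


def count_events(lines, event=EVENT):
    """Count trace lines that report the given event."""
    pattern = re.compile(r"\b" + re.escape(event) + r":")
    return sum(1 for line in lines if pattern.search(line))


def enable_event(event=EVENT, tracing=TRACING):
    """Turn the tracepoint on (needs root and a mounted tracefs)."""
    with open(f"{tracing}/events/xfs/{event}/enable", "w") as f:
        f.write("1")


def count_from_trace(event=EVENT, tracing=TRACING):
    """Count matching events currently held in the trace buffer."""
    with open(f"{tracing}/trace") as f:
        return count_events(f, event)
```

Calling enable_event(), running the rm, and then calling
count_from_trace() should give a number comparable to a printk() count,
without rebuilding the kernel.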

> Results:
>                                        ext4             XFS
> Total number of sectors discarded:     1032864          1043112  (about the same)
> Calls to blkdev_issue_discard():       24               39395
> Average sectors discarded per call:    43036 (21 MiB)   26.48 (13.2 KiB)

XFS is doing fine-grained discards, just as the mount option asks
for, but the block layer does not allow async dispatch/completion of
discard requests, so they are issued serially.  Hence the speed of
online discard is purely a function of number * duration: the number
of discards issued times the time the device takes for each one.
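
To put numbers on that, here is a back-of-the-envelope check using only
the figures quoted above. The 512-byte sector size and the reading of
"more than 6 minutes" as 360 seconds are assumptions, so the
per-discard cost it prints is a lower bound at best.

```python
SECTOR_BYTES = 512  # assumed sector size


def avg_per_call(total_sectors, calls):
    """Average sectors and bytes discarded per blkdev_issue_discard() call."""
    sectors = total_sectors / calls
    return sectors, sectors * SECTOR_BYTES


ext4_sectors, ext4_bytes = avg_per_call(1032864, 24)
xfs_sectors, xfs_bytes = avg_per_call(1043112, 39395)

print(f"ext4: {ext4_sectors:.0f} sectors/call, {ext4_bytes / 2**20:.1f} MiB")
print(f"XFS:  {xfs_sectors:.2f} sectors/call, {xfs_bytes / 2**10:.1f} KiB")

# Serial issue means total time ~= number * duration, so the implied
# average duration of one discard is total time / number.
implied_ms = 360 / 39395 * 1000  # 360 s assumed from "more than 6 minutes"
print(f"implied cost per XFS discard: at least {implied_ms:.1f} ms")
```

On those assumptions each discard costs the OCZ roughly 9 ms, which is
why 39395 of them issued back to back take minutes while 24 do not.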

Fixing this properly is twofold: allowing async discard dispatch,
and then changing XFS to use a 2-pass dispatch/wait-for-completion
model. This also assumes that the block layer can sort/merge separate
discard requests to reduce the number of requests sent to the
device, which I don't think it can do right now and which would also
need fixing.
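
The difference between the two models can be sketched in userspace
terms. This is only an analogy, not kernel code: the discard callable is
hypothetical, and a thread pool stands in for async dispatch in the
block layer.

```python
from concurrent.futures import ThreadPoolExecutor


def issue_serial(extents, discard):
    """Current behaviour: each discard completes before the next is issued."""
    return [discard(start, length) for start, length in extents]


def issue_two_pass(extents, discard, max_in_flight=32):
    """Pass 1 dispatches every discard, pass 2 waits for the completions."""
    with ThreadPoolExecutor(max_workers=max_in_flight) as pool:
        # Pass 1: dispatch without waiting for any individual request.
        pending = [pool.submit(discard, start, length)
                   for start, length in extents]
        # Pass 2: wait for each completion, in dispatch order.
        return [future.result() for future in pending]
```

With serial issue the total time is the sum of every request's latency.
With the 2-pass model it is bounded by how many requests the device can
usefully have in flight, which is a property of the hardware and not
something this sketch can tell you.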

We could probably add sorting before dispatching at the XFS layer,
but sorting/merging IO requests is really a function of the block
layer, not the filesystem...
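
For illustration, the sort/merge step itself is simple. A minimal
sketch, assuming discards are described as (start, length) pairs in the
same units, and saying nothing about where in the stack it should live:

```python
def merge_extents(extents):
    """Sort (start, length) extents and coalesce touching or overlapping ones."""
    merged = []
    for start, length in sorted(extents):
        if merged and start <= merged[-1][0] + merged[-1][1]:
            # This extent begins inside or right at the end of the previous
            # one, so grow the previous extent instead of adding a new one.
            prev_start, prev_len = merged[-1]
            end = max(prev_start + prev_len, start + length)
            merged[-1] = (prev_start, end - prev_start)
        else:
            merged.append((start, length))
    return merged
```

For example, merge_extents([(100, 8), (0, 8), (8, 8), (108, 4)]) yields
[(0, 16), (100, 12)]: four requests become two. How much this helps in
practice depends on how contiguous the freed extents are.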

> I made some TRIM performance measurements (see image on
> http://purdea.ro/projects/trim_perf/) with a small tool I wrote.
> (warning, destructive to data)
> It looks like the OCZ needs a really high amount of time to
> execute even the smallest TRIM command.
> I made a small post about this here: http://purdea.ro/projects/trim_perf/
> Can this be fixed in XFS?
> I am curious, do all SandForce controllers exhibit this slow TRIM issue?

The slowness of individual TRIM commands cannot be fixed in XFS -
it's a property of the SSD. And no, it's not unique to SandForce
controllers - there are other types of controllers that have
similar (or worse) TRIM behaviour.  Indeed, there have been
occurrences of SSDs being bricked by being sent too many TRIM
commands too often/quickly...

This is one of the reasons why our current recommendation is to not
use online discard, but to use background discard (FITRIM) to
periodically issue discards on the free space in the filesystem. It
does not happen in a filesystem performance fast-path, can be
configured with a minimum size to trim, and usually requires far
fewer TRIM commands to be sent to the device, so it is generally safer.
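
For reference, this is the interface that fstrim(8) drives. A minimal
sketch of calling it directly, assuming Linux with the common ioctl
number encoding (x86, ARM; some architectures encode it differently) and
root privileges. The function names are illustrative.

```python
import fcntl
import os
import struct

# _IOWR('X', 121, struct fstrim_range) with the asm-generic encoding.
# Assumption: other encodings (e.g. powerpc, mips) give a different value.
FITRIM = 0xC0185879
WHOLE_FS = 2**64 - 1


def pack_range(start=0, length=WHOLE_FS, minlen=0):
    """Build struct fstrim_range: three u64 fields (start, len, minlen)."""
    return bytearray(struct.pack("=QQQ", start, length, minlen))


def fitrim(mountpoint, minlen=0):
    """Trim free space on a mounted filesystem; returns bytes trimmed."""
    fd = os.open(mountpoint, os.O_RDONLY)
    try:
        buf = pack_range(minlen=minlen)
        # The kernel writes the number of bytes trimmed back into 'len'.
        fcntl.ioctl(fd, FITRIM, buf)
        _, trimmed, _ = struct.unpack("=QQQ", buf)
        return trimmed
    finally:
        os.close(fd)
```

Something like fitrim("/mnt/scratch", minlen=1 << 20) would skip free
extents smaller than 1 MiB; the mount point and the threshold there are
made-up examples, not a recommendation.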

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
