Re: discard and barriers

Steven Whitehouse <swhiteho@xxxxxxxxxx> · Mon, 16 Aug 2010 10:41:51 +0100

Hi,

On Sat, 2010-08-14 at 16:52 +0200, Christoph Hellwig wrote:
> On Sat, Aug 14, 2010 at 10:14:51AM -0400, Ted Ts'o wrote:
> > Also, to be clear, the block layer will guarantee that a trim/discard
> > of block 12345 will not be reordered with respect to a write to block
> > 12345, correct?
> 
> Right now that is what the hardbarrier does, and that's what we're
> trying to get rid of.  For XFS we prevent this by something that is
> called the busy extent list - extents deleted by a transaction are
> inserted into it (it's actually an rbtree, not a list, these days),
> and before we can reuse blocks from it we need to ensure that it
> is fully committed.  Discards only happen off that list and extents
> are only removed from it once the discard has finished.  I assume
> other filesystems have a similar mechanism.
> 
GFS2 has a similar concept: it compares two bitmaps to build the list of
extents to discard. This is done after each resource group has been
committed to the journal, and just before the resource group bitmap is
updated with the newly freed blocks (and marked dirty).
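
To illustrate the bitmap comparison, here is a rough userspace sketch, not
the actual GFS2 code. It assumes a simplified one-bit-per-block bitmap
(1 = allocated, 0 = free); struct extent, block_in_use() and
freed_extents() are names made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical extent descriptor: one run of newly freed blocks. */
struct extent {
	uint64_t start;	/* first block of the run */
	uint64_t len;	/* number of blocks in the run */
};

/* Assumption: one bit per block, 1 = allocated, 0 = free. */
static int block_in_use(const uint8_t *map, uint64_t blk)
{
	return (map[blk / 8] >> (blk % 8)) & 1;
}

/*
 * Compare the bitmap from before the transaction with the one after it
 * and collect runs of blocks that went from allocated to free.
 * Returns the number of extents stored in out; stops early once
 * max_out extents have been stored.
 */
static size_t freed_extents(const uint8_t *before, const uint8_t *after,
			    uint64_t nblocks, struct extent *out,
			    size_t max_out)
{
	size_t n = 0;
	uint64_t run_start = 0, run_len = 0;

	assert(before && after && out);

	for (uint64_t blk = 0; blk < nblocks; blk++) {
		int freed = block_in_use(before, blk) &&
			    !block_in_use(after, blk);

		if (freed) {
			if (run_len == 0)
				run_start = blk;
			run_len++;
		} else if (run_len > 0) {
			if (n == max_out)
				return n;
			out[n].start = run_start;
			out[n].len = run_len;
			n++;
			run_len = 0;
		}
	}

	/* Close a run that extends to the end of the bitmap. */
	if (run_len > 0 && n < max_out) {
		out[n].start = run_start;
		out[n].len = run_len;
		n++;
	}
	return n;
}
```

Each extent returned would then become one discard, issued before the
freed blocks are made visible in the resource group bitmap.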

Any remote node wanting to use that new space will cause a further
journal flush when it requests the resource group lock (as well as in
place write back of that resource group, of course).

If the local node wants to reuse the recently freed space, then that can
happen as soon as the log commit has finished, so in this case the
barrier and the waiting are required. At the moment the code seems to do
that on every request; however, there is no reason why we couldn't move
the barrier to the end of the log flush code and have one per log flush,
conditional upon a discard having been issued (or some equivalent
construct, bearing in mind the objective of removing barriers).
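
Roughly what I have in mind, as a toy userspace model rather than a patch.
struct log_state, submit_discard() and log_flush() are invented for the
example, and the barrier is just a counter standing in for whatever
barrier or wait we end up with.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-journal state, for illustration only. */
struct log_state {
	bool discard_issued;	/* a discard went out since the last flush */
	unsigned int discards;	/* discards submitted so far */
	unsigned int barriers;	/* barriers (or equivalent waits) issued */
};

/* Submit a discard without a barrier; just note that one is owed. */
static void submit_discard(struct log_state *log)
{
	assert(log != NULL);
	log->discards++;
	log->discard_issued = true;
}

/*
 * Flush the log. The journal write itself is omitted here; the point
 * is that at most one barrier is issued per flush, and only if a
 * discard was submitted since the previous flush.
 */
static void log_flush(struct log_state *log)
{
	assert(log != NULL);
	if (log->discard_issued) {
		log->barriers++;	/* stands in for one barrier + wait */
		log->discard_issued = false;
	}
}
```

With this shape, three discards followed by a flush cost one barrier
instead of three, and a flush with no discards outstanding costs none.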

Steve.
