Hi,

On Sat, 2010-08-14 at 16:52 +0200, Christoph Hellwig wrote:
> On Sat, Aug 14, 2010 at 10:14:51AM -0400, Ted Ts'o wrote:
> > Also, to be clear, the block layer will guarantee that a trim/discard
> > of block 12345 will not be reordered with respect to a write block
> > 12345, correct?
>
> Right now that is what the hardbarrier does, and that's what we're
> trying to get rid of.  For XFS we prevent this by something that is
> called the busy extent list - extents deleted by a transaction are
> inserted into it (it's actually a rbtree not a list these days),
> and before we can reuse blocks from it we need to ensure that it
> is fully committed.  Discards only happen off that list and extents
> are only removed from it once the discard has finished.  I assume
> other filesystems have a similar mechanism.
>

GFS2 has a similar concept: it compares two bitmaps to build the list
of extents to discard. This is done after each resource group has been
committed to the journal, and just before the resource group bitmap is
updated with the newly freed blocks (and marked dirty).

Any remote node wanting to use that new space will cause a further
journal flush when it requests the resource group lock (as well as
in-place writeback of that resource group, of course).

If the local node wants to reuse the recently freed space, then that
can happen as soon as the log commit has finished, so in this case the
barrier and the waiting are required. At the moment it seems to be
doing that on every request. However, there is no reason why we
couldn't move the barrier to the end of the log flush code and have one
per log flush, conditional upon a discard having been issued (or some
equivalent construct, bearing in mind the objective of removing
barriers).

Steve.
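
The bitmap comparison described above can be sketched roughly as
follows. This is an illustration only, not the GFS2 code: it assumes
one bit per block (GFS2 uses two bits per block), and the names
struct extent, block_in_use() and freed_extents() are made up for the
example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical extent: a run of contiguous, newly freed blocks. */
struct extent {
	uint64_t start;	/* first block of the run */
	uint64_t len;	/* number of blocks in the run */
};

/* Return 1 if the bit for `block` is set (block in use) in `map`. */
static int block_in_use(const uint8_t *map, uint64_t block)
{
	return (map[block / 8] >> (block % 8)) & 1;
}

/*
 * Compare the bitmap as it was before the transaction (`before`) with
 * the bitmap after it (`after`) and collect runs of blocks that were
 * in use before and are free now. These runs are the candidates for
 * discard. At most `max` extents are written to `out`; any further
 * runs are dropped. Returns the number of extents written.
 */
static size_t freed_extents(const uint8_t *before, const uint8_t *after,
			    uint64_t nblocks, struct extent *out, size_t max)
{
	size_t n = 0;
	uint64_t run_start = 0, run_len = 0;

	assert(before && after && out);

	for (uint64_t b = 0; b < nblocks; b++) {
		int freed = block_in_use(before, b) && !block_in_use(after, b);

		if (freed) {
			if (run_len == 0)
				run_start = b;	/* a new run begins here */
			run_len++;
		} else if (run_len) {
			/* The run just ended: record it if there is room. */
			if (n < max) {
				out[n].start = run_start;
				out[n].len = run_len;
				n++;
			}
			run_len = 0;
		}
	}

	/* Record a run that extends to the last block. */
	if (run_len && n < max) {
		out[n].start = run_start;
		out[n].len = run_len;
		n++;
	}

	return n;
}
```

Each extent returned this way would correspond to one discard, which is
why a single barrier at the end of the log flush, issued only if the
list is non-empty, would cover all of them.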