Hi Dave, I believe I understand your intent of having the discard tree almost fully disconnected from the busy extents tree. I planned to use the the same pagb_lock for the discard tree as well, which solves the atomicity and ordering issues you have described. If we use a different lock (mutex), then some additional care is needed. One thing I like less about such approach, is that we need to be careful of not growing the discard tree too large. If, for some reason, the worker thread is not able to keep up discarding extents while they are queued for discard, we may end up with too many discard ranges in the tree. Also at the same time user can trigger an offline discard, which will add more discard ranges to the tree. While the online discard that you have, kind of throttles itself. So if the underlying block device is slow on discarding, the whole system will be slowed down accordingly. [snip] I have one additional question regarding your comment on metadata >> Q1: why it is ok to do so? why it is ok for "metadata" to reuse part >> (or all) of the busy extent before its extent-free-intent is >> committed? > > The metadata changes are logged, and crash recovery allows correct > ordering of the free, realloc and modify process for metadata. Hence > it doesn't matter that we overwrite the contents of the block before > the free transaction is on disk - the correct contents will always > be present after recovery. > > We can't do that for user data because we don't log user data. > Therefore if we allow user data to overwrite the block whil eit is > still busy, crash recovery may not result in the block having the > correct contents (i.e. the transaction that freed the block never > reaches the journal) and we hence expose some other user's data or > metadata in the file. If that is the case, why cannot we just issue an async discard before the busy extent is committed? I understand that if we crashed, we might have knocked off some of the user data (or changed it to some new data). But can XFS get corrupted (unmountable) this way? You said earlier that it can, but now you are saying that reusing a metadata block, before its busy extent commits, is fine. Thanks, Alex. _______________________________________________ xfs mailing list xfs@xxxxxxxxxxx http://oss.sgi.com/mailman/listinfo/xfs