Re: [RFCv4 00/76] xfs: add reverse-mapping, reflink, and dedupe support

Dave Chinner <david@xxxxxxxxxxxxx> · Wed, 6 Jan 2016 14:44:15 +1100

On Tue, Jan 05, 2016 at 06:04:40PM -0800, Darrick J. Wong wrote:
> On Tue, Jan 05, 2016 at 07:42:26AM -0500, Brian Foster wrote:
> > On Mon, Jan 04, 2016 at 03:59:51PM -0800, Darrick J. Wong wrote:
> > > I've temporarily fixed this by adding code that figures out how many blocks we
> > > need if the reference count btree has to have a unique record for every block
> > > in the AG and holding that many blocks until either they're allocated to the
> > > refcount btree or freed at umount time.  Right now it's a temporary fix (if the
> > > FS crashes, the reserved blocks are lost) but it wouldn't be difficult for the
> > > FS to make a permanent reservation that's recorded on disk somehow.  But that
> > > involves writing things to disk + making xfsprogs understand the reservation;
> > > let's see what people say about the reserved pool idea at all.
> > > 
> > > Does that make sense? :)
> > > 
> > 
> > Yep, it sounds sort of like the reserve pool mechanism used to protect
> > against ENOSPC when freeing blocks. Curious... why are the reserved
> > blocks lost on fs crash? Wouldn't they be reserved again on the
> > subsequent mount?
> 
> They will, but the pre-crash reservation isn't (yet) written down anywhere on
> disk.

Does it need to be? The global reserve pool is not "written down"
anywhere. When we mount, we pull the reserve from the global free
space accounting. Hence we give ENOSPC when we've used "total fs
blocks - reserve pool blocks" in memory, and so if we crash we've
still got at least that many free blocks on disk. Hence on mount we
re-reserve those blocks in memory and everything is back to the way
it was prior to the crash.
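
To illustrate the accounting described above, here is a rough toy
model. It is not the XFS code: the struct, its fields, and both
functions are hypothetical names made up for this sketch. The only
point is that the reserve lives purely in memory, yet the on-disk free
count can never drop below it.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical toy model, not the real XFS structures. */
struct toy_fs {
	uint64_t disk_free;	/* free blocks as recorded on disk */
	uint64_t avail;		/* in-memory free blocks visible to allocations */
	uint64_t reserved;	/* reserve pool, held in memory only */
};

/* Mount: pull the reserve out of the free space accounting. */
static void toy_fs_mount(struct toy_fs *fs, uint64_t disk_free, uint64_t want)
{
	uint64_t resv = want < disk_free ? want : disk_free;

	fs->disk_free = disk_free;
	fs->reserved = resv;
	fs->avail = disk_free - resv;
}

/* Ordinary allocation: fails with ENOSPC once only the reserve is left. */
static int toy_fs_alloc(struct toy_fs *fs, uint64_t nblocks)
{
	if (nblocks > fs->avail)
		return -ENOSPC;
	fs->avail -= nblocks;
	fs->disk_free -= nblocks;
	return 0;
}
```

A "crash" in this model just means throwing the struct away and
calling toy_fs_mount() again with the surviving disk_free value. Since
allocations stop at avail == 0, disk_free is still at least the
reserve size, so the re-reservation gets the same number of blocks.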

I suspect the per-ag code is a bit different, but it should be able
to work the same way. i.e. when we initialise the per-ag structure,
we pull the reserve from the free block count in the AG, as well as
from the global free space count. Then we will get correct global
ENOSPC detection, as well as leave enough space free in each AG as
we scan and skip them during allocation...
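
A minimal sketch of that per-AG variant, again as a toy model rather
than the actual per-ag code. MODEL_AGCOUNT, struct ag_model, struct
mount_model, ag_resv_init() and model_alloc() are all invented for
illustration. The reservation is subtracted from both the AG free count
and the global count, and the allocator skips any AG that cannot
satisfy the request from its unreserved space.

```c
#include <errno.h>
#include <stdint.h>

#define MODEL_AGCOUNT 4		/* hypothetical, fixed for the sketch */

struct ag_model {
	uint64_t free;		/* unreserved free blocks in this AG */
	uint64_t resv;		/* blocks held back for this AG, in memory only */
};

struct mount_model {
	uint64_t fdblocks;	/* global free block count */
	struct ag_model ag[MODEL_AGCOUNT];
};

/*
 * Per-AG init: pull the reserve from the AG free count and from the
 * global free count. Assumes fdblocks starts as the sum of ag[].free.
 */
static void ag_resv_init(struct mount_model *mp, int agno, uint64_t want)
{
	struct ag_model *ag = &mp->ag[agno];
	uint64_t resv = want < ag->free ? want : ag->free;

	ag->resv = resv;
	ag->free -= resv;
	mp->fdblocks -= resv;
}

/* Returns the AG number used, or -ENOSPC if no AG can satisfy the request. */
static int model_alloc(struct mount_model *mp, uint64_t len)
{
	int agno;

	if (len > mp->fdblocks)
		return -ENOSPC;		/* global ENOSPC detection */

	for (agno = 0; agno < MODEL_AGCOUNT; agno++) {
		struct ag_model *ag = &mp->ag[agno];

		if (ag->free < len)
			continue;	/* skip: only reserved space left here */
		ag->free -= len;
		mp->fdblocks -= len;
		return agno;
	}
	return -ENOSPC;
}
```

With four AGs of 10 free blocks each and a reserve of 4 per AG, the
global count drops to 24 after init, and each AG stops handing out
blocks once its 6 unreserved blocks are gone.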

As long as the per-ag reservation is restored during mount before we
do EFI recovery processing (i.e. between the two log recovery
phases), it should restore the reserve pool to the same size as it
was before a crash occurred....

Unless, of course, I'm missing something newly introduced by the
reflink code...

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
