On Tue, Jan 05, 2016 at 06:04:40PM -0800, Darrick J. Wong wrote:
> On Tue, Jan 05, 2016 at 07:42:26AM -0500, Brian Foster wrote:
> > On Mon, Jan 04, 2016 at 03:59:51PM -0800, Darrick J. Wong wrote:
> > > I've temporarily fixed this by adding code that figures out how many blocks we
> > > need if the reference count btree has to have a unique record for every block
> > > in the AG and holding that many blocks until either they're allocated to the
> > > refcount btree or freed at umount time. Right now it's a temporary fix (if the
> > > FS crashes, the reserved blocks are lost) but it wouldn't be difficult for the
> > > FS to make a permanent reservation that's recorded on disk somehow. But that's
> > > involves writing things to disk + making xfsprogs understand the reservation;
> > > let's see what people say about the reserved pool idea at all.
> > >
> > > Does that make sense? :)
> > >
> >
> > Yep, it sounds sort of like the reserve pool mechanism used to protect
> > against ENOSPC when freeing blocks. Curious... why are the reserved
> > blocks lost on fs crash? Wouldn't they be reserved again on the
> > subsequent mount?
>
> They will, but the pre-crash reservation isn't (yet) written down anywhere on
> disk.

Does it need to be?

The global reserve pool is not "written down" anywhere. When we
mount, we pull the reserve from the global free space accounting.
Hence we give ENOSPC when we've used "total fs blocks - reserve
pool blocks" in memory, and so if we crash we've still got at least
that many free blocks on disk. Hence on mount we re-reserve those
blocks in memory and everything is back to the way it was prior to
the crash.

I suspect the per-ag code is a bit different, but it should be able to
work the same way. i.e. when we initialise the per-ag structure, we
pull the reserve from the free block count in the AG, as well as
from the global free space count.
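
To make that concrete, here is a rough userspace sketch of the
accounting described above. It is only an illustration: the struct and
function names are hypothetical, not the real XFS structures or
functions, and it assumes the reservation lives purely in memory while
the free block counts are what survives on disk. A crash is then
modelled by zeroing both resv_blocks fields; calling ag_resv_init()
again at mount finds the blocks still free, because ordinary
allocations were refused before they could eat into the reserve.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical per-AG accounting; not the real XFS per-ag structure. */
struct ag_sketch {
	uint64_t free_blocks;	/* free blocks as recorded on disk */
	uint64_t resv_blocks;	/* in-memory reservation, never written out */
};

/* Hypothetical global accounting. */
struct fs_sketch {
	uint64_t free_blocks;	/* global free block count, from disk */
	uint64_t resv_blocks;	/* sum of the in-memory reservations */
};

/* Blocks that ordinary allocations may still consume globally. */
static uint64_t fs_avail(const struct fs_sketch *fs)
{
	assert(fs->resv_blocks <= fs->free_blocks);
	return fs->free_blocks - fs->resv_blocks;
}

/*
 * Run for each AG at mount (assumed to happen before EFI recovery):
 * pull up to 'want' blocks out of both the AG and the global free
 * space accounting. Returns how many blocks were actually reserved.
 */
static uint64_t ag_resv_init(struct fs_sketch *fs, struct ag_sketch *ag,
			     uint64_t want)
{
	uint64_t ag_room = ag->free_blocks - ag->resv_blocks;
	uint64_t got = want;

	if (got > ag_room)
		got = ag_room;
	if (got > fs_avail(fs))
		got = fs_avail(fs);

	ag->resv_blocks += got;
	fs->resv_blocks += got;
	return got;
}

/*
 * Ordinary allocation: fails with -ENOSPC once only reserved blocks
 * remain, so the on-disk free counts never drop below the reserve.
 */
static int ag_alloc(struct fs_sketch *fs, struct ag_sketch *ag, uint64_t len)
{
	if (len > ag->free_blocks - ag->resv_blocks)
		return -ENOSPC;
	if (len > fs_avail(fs))
		return -ENOSPC;

	ag->free_blocks -= len;
	fs->free_blocks -= len;
	return 0;
}
```
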
Then we will get correct global
ENOSPC detection, as well as leave enough space free in each AG as
we scan and skip them during allocation...

As long as the per-ag reservation is restored during mount before we
do EFI recovery processing (i.e. between the two log recovery
phases), it should restore the reserve pool to the same size as it
was before a crash occurred...

Unless, of course, I'm missing something newly introduced by the
reflink code...

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx

_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs