On Wed, Jan 06, 2016 at 02:44:15PM +1100, Dave Chinner wrote:
> On Tue, Jan 05, 2016 at 06:04:40PM -0800, Darrick J. Wong wrote:
> > On Tue, Jan 05, 2016 at 07:42:26AM -0500, Brian Foster wrote:
> > > On Mon, Jan 04, 2016 at 03:59:51PM -0800, Darrick J. Wong wrote:
> > > > I've temporarily fixed this by adding code that figures out how many blocks we
> > > > need if the reference count btree has to have a unique record for every block
> > > > in the AG and holding that many blocks until either they're allocated to the
> > > > refcount btree or freed at umount time.  Right now it's a temporary fix (if the
> > > > FS crashes, the reserved blocks are lost) but it wouldn't be difficult for the
> > > > FS to make a permanent reservation that's recorded on disk somehow.  But that
> > > > involves writing things to disk + making xfsprogs understand the reservation;
> > > > let's see what people say about the reserved pool idea at all.
> > > >
> > > > Does that make sense? :)
> > > >
> > >
> > > Yep, it sounds sort of like the reserve pool mechanism used to protect
> > > against ENOSPC when freeing blocks. Curious... why are the reserved
> > > blocks lost on fs crash? Wouldn't they be reserved again on the
> > > subsequent mount?
> >
> > They will, but the pre-crash reservation isn't (yet) written down anywhere on
> > disk.
>
> Does it need to be? The global reserve pool is not "written down"
> anywhere. When we mount, we pull the reserve from the global free
> space accounting. Hence we give ENOSPC when we've used "total fs
> blocks - reserve pool blocks" in memory, and so if we crash we've
> still got at least that many free blocks on disk. Hence on mount we
> re-reserve those blocks in memory and everything is back to the way
> it was prior to the crash.
>
> I suspect the per-ag code is a bit different, but it should be able
> to work the same way. i.e. when we initialise the per-ag structure,
> we pull the reserve from the free block count in the AG, as well as
> from the global free space count. Then we will get correct global
> ENOSPC detection, as well as leave enough space free in each AG as
> we scan and skip them during allocation...
>
> As long as the per-ag reservation is restored during mount before we
> do EFI recovery processing (i.e. between the two log recovery
> phases), it should restore the reserve pool to the same size as it
> was before a crash occurred....
>
> Unless, of course, I'm missing something newly introduced by the
> reflink code...

Technically you were, but I've fixed the reservation code to exist purely as
in-core magic that works more or less how you outlined above.  No more on-disk
artifacts, no more need to write a persistence and recovery mechanism. :)

--D

>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@xxxxxxxxxxxxx
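
For anyone following along, here is a rough userspace sketch of the in-core
scheme described above.  It is NOT the actual XFS code: struct counter,
resv_mount(), alloc_blocks(), alloc_from_resv() and simulate_crash() are all
names made up for this illustration, and it models a single AG only.  The
point it tries to show is that the reservation lives purely in memory, so
after a crash the persistent free count still covers it and mount can simply
take it again.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical model: one of these per AG, plus one for the whole fs. */
struct counter {
	uint64_t disk_free;	/* free blocks; survives a crash */
	uint64_t resv;		/* reserved blocks; in-core only */
};

/* Blocks that ordinary allocations are allowed to use. */
uint64_t avail(const struct counter *c)
{
	return c->disk_free - c->resv;
}

/* At mount: pull the reserve from both the AG and the global free space. */
int resv_mount(struct counter *fs, struct counter *ag, uint64_t want)
{
	if (want > avail(ag) || want > avail(fs))
		return -ENOSPC;
	ag->resv += want;
	fs->resv += want;
	return 0;
}

/* Ordinary allocation: may never dip into the reserve. */
int alloc_blocks(struct counter *fs, struct counter *ag, uint64_t len)
{
	if (len > avail(ag) || len > avail(fs))
		return -ENOSPC;
	ag->disk_free -= len;
	fs->disk_free -= len;
	return 0;
}

/* Refcount btree allocation: consumes blocks out of the reserve. */
int alloc_from_resv(struct counter *fs, struct counter *ag, uint64_t len)
{
	if (len > ag->resv)
		return -ENOSPC;
	/* The reserve must always be backed by really-free blocks. */
	assert(ag->resv <= ag->disk_free && fs->resv <= fs->disk_free);
	ag->resv -= len;
	fs->resv -= len;
	ag->disk_free -= len;
	fs->disk_free -= len;
	return 0;
}

/* A crash forgets every in-core reservation; disk_free is untouched. */
void simulate_crash(struct counter *fs, struct counter *ag)
{
	ag->resv = 0;
	fs->resv = 0;
}
```

Because ordinary allocations stop at disk_free - resv, the blocks backing the
reserve are still free on disk after simulate_crash(), and calling
resv_mount() again with the same size succeeds.  In the real filesystem that
re-reservation would have to happen before EFI recovery, as noted above.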