On Tue, Feb 23, 2021 at 09:46:38AM -0600, Eric Sandeen wrote:
> On 2/23/21 9:03 AM, Gao Xiang wrote:
> > On Tue, Feb 23, 2021 at 08:40:56AM -0600, Eric Sandeen wrote:
> >> On 2/23/21 7:42 AM, Gao Xiang wrote:
> >>> Hi folks,
> >>>
> >>> On Wed, Mar 28, 2018 at 08:17:28AM +1100, Dave Chinner wrote:
> >>>> On Mon, Mar 26, 2018 at 08:46:49AM -0400, Brian Foster wrote:
> >>>>> On Sat, Mar 24, 2018 at 09:20:49AM -0700, Darrick J. Wong wrote:
> >>>>>> On Wed, Mar 07, 2018 at 05:33:48PM -0600, Eric Sandeen wrote:
> >>>>>>> Now that unlinked inode recovery is done outside of
> >>>>>>> log recovery, there is no need to dirty the log on
> >>>>>>> snapshots just to handle unlinked inodes. This means
> >>>>>>> that readonly snapshots can be mounted without requiring
> >>>>>>> -o ro,norecovery to avoid the log replay that can't happen
> >>>>>>> on a readonly block device.
> >>>>>>>
> >>>>>>> (Unlinked inodes will just hang out in the AGI buckets until
> >>>>>>> the next writable mount.)
> >>>>>>
> >>>>>> FWIW I put these two in a test kernel to see what would happen,
> >>>>>> and generic/311 failures popped up. It looked like
> >>>>>> _check_scratch_fs found incorrect block counts on the
> >>>>>> snapshot(?)
> >>>>>
> >>>>> Interesting. Just a wild guess, but perhaps it has something to
> >>>>> do with lazy sb accounting..? I see we call
> >>>>> xfs_initialize_perag_data() when mounting an unclean fs.
> >>>>
> >>>> The freeze calls xfs_log_sbcount(), which should update the
> >>>> superblock counters from the in-memory counters and write them
> >>>> to disk.
> >>>>
> >>>> If they are off, I'm guessing it's because the in-memory per-AG
> >>>> reservations are not being returned to the global pool before
> >>>> the in-memory counters are summed during a freeze....
> >>>>
> >>>> Cheers,
> >>>>
> >>>> Dave.
> >>>> --
> >>>> Dave Chinner
> >>>> david@xxxxxxxxxxxxx
> >>>
> >>> I spent some time tracking this problem down. I've made a quick
> >>> modification to the per-AG reservation handling and tested it
> >>> with generic/311; it seems fine. My current question is how such
> >>> fsfreeze'd images (which now mount clean) behave on old kernels
> >>> that lack [PATCH 1/1]. I'm afraid orphan inodes won't be freed on
> >>> such old kernels.... Am I missing something?
> >>
> >> It's true, a snapshot created with these patches will not have its
> >> unlinked inodes processed if mounted on an older kernel. I'm not
> >> sure how much of a problem that is; the filesystem is not
> >> inconsistent, but some space is lost, I guess. I'm not sure it's
> >> common to take a snapshot of a frozen filesystem on one kernel and
> >> then move it back to an older kernel. Maybe others have thoughts
> >> on this.

Yes, I know of cloudy image generation factories that use old versions
of RHEL to generate images that are then frozen and copied to a
deployment system without an unmount. I don't understand why they
insist that unmount is "too slow" but freeze isn't, nor why they then
file bugs that their instance deploy process is unacceptably slow
because of log recovery.

> > My current thought is to write a clean log on freeze only when
> > there are no unlinked inodes, but to leave the log dirty if any
> > unlinked inodes exist, as Brian mentioned before, and not handle
> > that case (?). I'd like to hear more comments about this as well.
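(For concreteness, the test such a conditional freeze would need is a
scan of every AGI's unlinked-inode buckets. The sketch below is purely
illustrative -- it is not from any posted patch -- though it builds on
interfaces that do exist in the kernels under discussion:
xfs_read_agi(), the on-disk agi_unlinked[] bucket array, and
NULLAGINO.)

/*
 * Sketch: return true if any AG still has inodes on its AGI unlinked
 * lists.  A conditional "dirty the log on freeze" policy would call
 * this from the freeze path.  Illustration only; AGI read errors are
 * simply skipped here.
 */
STATIC bool
xfs_has_unlinked_inodes(
	struct xfs_mount	*mp)
{
	struct xfs_buf		*agibp;
	struct xfs_agi		*agi;
	xfs_agnumber_t		agno;
	int			bucket;

	for (agno = 0; agno < mp->m_sb.sb_agcount; agno++) {
		if (xfs_read_agi(mp, NULL, agno, &agibp))
			continue;
		agi = agibp->b_addr;
		for (bucket = 0; bucket < XFS_AGI_UNLINKED_BUCKETS; bucket++) {
			if (agi->agi_unlinked[bucket] !=
			    cpu_to_be32(NULLAGINO)) {
				xfs_buf_relse(agibp);
				return true;
			}
		}
		xfs_buf_relse(agibp);
	}
	return false;
}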
>
> I don't know if I had made this comment before ;) but I feel like
> that's even more of a "surprise" (as in: it gets further from the
> principle of least surprise), and TBH I would rather not have that
> somewhat unpredictable behavior.
>
> I think I'd rather /always/ make a dirty log than sometimes do it
> and other times not. It'd just be more confusion for the admin,
> IMHO.

...but the next time anyone wants to introduce a new incompat/rocompat
feature flag for something inode-related, you can disable the "leave a
dirty log on freeze if there are unlinked inodes" behavior.

--D

>
> Thanks,
> -Eric
>
> > Thanks,
> > Gao Xiang
> >
> >> -Eric
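P.S. For reference, the freeze-path change Dave suggested back in the
2018 thread -- and presumably the shape of Gao's "quick modification"
-- would look roughly like the following. A sketch only, not the
actual patch; it assumes the existing xfs_fs_unreserve_ag_blocks() /
xfs_fs_reserve_ag_blocks() helpers and elides the rest of the
freeze/thaw work:

STATIC int
xfs_fs_freeze(
	struct super_block	*sb)
{
	struct xfs_mount	*mp = XFS_M(sb);
	int			error;

	/*
	 * Drain the per-AG metadata reservations back into the global
	 * free space counters so that the values summed and written
	 * out by xfs_log_sbcount() match what a clean mount would
	 * compute.
	 */
	error = xfs_fs_unreserve_ag_blocks(mp);
	if (error)
		return error;

	/* ... existing freeze work: quiesce the log, sync counters ... */
	return 0;
}

STATIC int
xfs_fs_unfreeze(
	struct super_block	*sb)
{
	struct xfs_mount	*mp = XFS_M(sb);

	/* ... existing unfreeze work ... */

	/* Re-establish the per-AG reservations for normal operation. */
	return xfs_fs_reserve_ag_blocks(mp);
}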