On 12/7/16 12:30 AM, Dave Chinner wrote: > On Tue, Dec 06, 2016 at 05:00:47PM +0800, Hou Tao wrote: >> During the 2nd stage of log recovery, if the filesystem is firstly mounted >> as read-only, the unlink inodes will not be destroyed and the unlinked list >> in AGI will not be cleared. Even after a read-write remount or umount, >> the unlinked inodes will still be valid and be kept on disk, and the >> available freespace will be incorrect. >> >> To fix the problem, we need to force xfs_inactive() to destroy the >> unlinked inode when the filesystem is mounted as read-only. >> So clear the XFS_MOUNT_RDONLY flag temporarily before the recovery >> of unlinked inodes and restore the flag after the recovery has done. >> >> The problem can be reproduced by the following steps: >> 1. mount a xfs fs on a KVM VM >> 2. on the VM launch an application which does the following things: >> open(xfs_file); unlink(xfs_file); >> while(1) { write(xfs_file, 2MB); sleep(1); } >> 3. wait 5 seconds, sync the xfs fs, and wait 5 seconds >> 4. terminate the VM >> 5. start the VM and mount the xfs as read-only >> 6. remount the xfs as read-write or umount >> 7. check the unlinked list and the available freespace > > This is only papering over the larger problem. > > I was talking to Eric about this larger "recovery on read-only > mount" problem last week on IRC - I can't find it my logs right now, > but IIRC I'd suggested that we should always run xfs_mountfs() > in read-write mount if the underlying device can be written to, > and then once that is complete do a rw->ro transition exactly as we > do for a remount,ro operation. Yeah, I have a larger patchset to try to handle this and other related processing that wasn't happening on ro mounts. I got derailed because my regression test for it ran into all kinds of unexpected new & unrelated bugs. So I haven't sent it yet... There were lots of little bits here and there stemming, I think, from old Irix code that didn't do /any/ device IO on a ro mount. -Eric > That way we remove all the special "write on read only mount" hacks > we currently have throughout the code to enable log recovery to run > on read-only mounts. > > Essentially is requires moving the device read only check from the > log code to xfs_fs_fill_super() and handling the no-recovery flag > there before we call xfs_mountfs() and adding the rw->ro state > transition after we return. This will be much simpler and much more > reliable than trying to turn off "read only" state around certain > operations... > > Cheers, > > Dave. > -- To unsubscribe from this list: send the line "unsubscribe stable" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html