On Sat, Dec 26, 2020 at 04:56:41AM -0500, Shijie Luo wrote: > The root cause is that when offline/onlines disks, the filesystem can easily get into > a error state and this makes it change to be read-only. Function freeze_super() will hold > all sb_writers rwsems including rwsem of SB_FREEZE_WRITE when filesystem not read-only, > but thaw_super_locked() cannot release these while the filesystem suddenly become read-only, > because the logic will go to out. > > freeze_super > hold sb_writers rwsems > sb->s_writers.frozen = SB_FREEZE_COMPLETE > thaw_super_locked > sb_rdonly > sb->s_writers.frozen = SB_UNFROZEN; > goto out // not release rwsems > > And at this time, if we call mnt_want_write(), the process will be blocked. > > This patch fixes this problem, when filesystem is read-only, just not to set sb_writers.frozen > be SB_FREEZE_COMPLETE in freeze_super() and then release all rwsems in thaw_super_locked. I really don't like that - you end up with a case when freeze_super() returns 0 *and* consumes the reference it had been give. > if (sb_rdonly(sb)) { > - /* Nothing to do really... */ > - sb->s_writers.frozen = SB_FREEZE_COMPLETE; > - up_write(&sb->s_umount); > + deactivate_locked_super(sb); > return 0; > }