On Thu 17-08-23 16:54:32, Christian Brauner wrote:
> On Thu, Aug 17, 2023 at 04:37:36PM +0200, Jan Kara wrote:
> > On Thu 17-08-23 12:47:44, Christian Brauner wrote:
> > > Recent rework moved block device closing out of sb->put_super() and into
> > > sb->kill_sb() to avoid deadlocks as s_umount is held in put_super() and
> > > blkdev_put() can end up taking s_umount again.
> > >
> > > That means we need to move the removal of the superblock from @fs_supers
> > > out of generic_shutdown_super() and into deactivate_locked_super() to
> > > ensure that concurrent mounters don't fail to open block devices that
> > > are still in use because blkdev_put() in sb->kill_sb() hasn't been
> > > called yet.
> > >
> > > We can now do this as we can make iterators through @fs_super and
> > > @super_blocks wait without holding s_umount. Concurrent mounts will wait
> > > until a dying superblock is fully dead so until sb->kill_sb() has been
> > > called and SB_DEAD been set. Concurrent iterators can already discard
> > > any SB_DYING superblock.
> > >
> > > Signed-off-by: Christian Brauner <brauner@xxxxxxxxxx>
> > > ---
> > >  fs/super.c         | 71 +++++++++++++++++++++++++++++++++++++++++++++++++-----
> > >  include/linux/fs.h |  1 +
> > >  2 files changed, 66 insertions(+), 6 deletions(-)
> >
> > <snip>
> >
> > > @@ -456,6 +497,25 @@ void deactivate_locked_super(struct super_block *s)
> > >  		list_lru_destroy(&s->s_dentry_lru);
> > >  		list_lru_destroy(&s->s_inode_lru);
> > >
> > > +		/*
> > > +		 * Remove it from @fs_supers so it isn't found by new
> > > +		 * sget{_fc}() walkers anymore. Any concurrent mounter still
> > > +		 * managing to grab a temporary reference is guaranteed to
> > > +		 * already see SB_DYING and will wait until we notify them about
> > > +		 * SB_DEAD.
> > > +		 */
> > > +		spin_lock(&sb_lock);
> > > +		hlist_del_init(&s->s_instances);
> > > +		spin_unlock(&sb_lock);
> > > +
> > > +		/*
> > > +		 * Let concurrent mounts know that this thing is really dead.
> > > +		 * We don't need @sb->s_umount here as every concurrent caller
> > > +		 * will see SB_DYING and either discard the superblock or wait
> > > +		 * for SB_DEAD.
> > > +		 */
> > > +		super_wake(s, SB_DEAD);
> > > +
> > >  		put_filesystem(fs);
> > >  		put_super(s);
> > >  	} else {
> > > @@ -638,15 +698,14 @@ void generic_shutdown_super(struct super_block *sb)
> > >  			spin_unlock(&sb->s_inode_list_lock);
> > >  		}
> > >  	}
> > > -	spin_lock(&sb_lock);
> > > -	/* should be initialized for __put_super_and_need_restart() */
> > > -	hlist_del_init(&sb->s_instances);
> > > -	spin_unlock(&sb_lock);
> >
> > OK, but we have several checks of hlist_unhashed(&sb->s_instances) in the
> > code whose meaning is now subtly changed. We have:
>
> If by changed meaning you mean they can be dropped, then yes.
> That's what I understand you as saying given the following list.

Yes, they can all be dropped, but we probably need an SB_DYING check in
trylock_super() and preferably in __iterate_supers() as well.

> > trylock_super() - needs SB_DYING check instead of s_instances check
> > __iterate_supers() - probably we should add SB_DYING check to not block
> >   emergency operations on s_umount unnecessarily and drop s_instances
> >   check
> > iterate_supers() - we can drop s_instances check
> > get_super() - we can drop s_instances check
> > get_active_super() - we can drop s_instances check
> > user_get_super() - we can drop s_instances check
>
> But does this otherwise look reasonable?

Yes, otherwise the patch looks good to me.

> (Btw, just because I noticed it, do you prefer suse.cz or suse.com?)

I prefer suse.cz because suse.com goes through MS Exchange before getting
into our Linux mailing system and that occasionally causes trouble.

								Honza
-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR
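
For illustration only, here is a small userspace C sketch of why an
s_instances check and an SB_DYING check give different answers once the
unhashing moves into deactivate_locked_super(). Nothing below is kernel
code: struct model_sb, the MODEL_SB_* flags and the model_*() helpers are
hypothetical stand-ins, and the sketch ignores locking, waiting and any
other condition the real helpers test. It also assumes SB_DYING is set at
some earlier point of teardown, before ->kill_sb() has finished.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for SB_DYING / SB_DEAD; values are arbitrary. */
enum model_sb_flags {
	MODEL_SB_DYING = 1u << 0,
	MODEL_SB_DEAD  = 1u << 1,
};

/* Hypothetical, heavily reduced model of a superblock. */
struct model_sb {
	unsigned int flags;
	bool on_fs_supers;	/* models !hlist_unhashed(&sb->s_instances) */
};

/* Old-style test: "still hashed on @fs_supers" means usable. */
static inline bool model_usable_by_list(const struct model_sb *sb)
{
	assert(sb != NULL);
	return sb->on_fs_supers;
}

/* Suggested replacement: "not SB_DYING" means usable. */
static inline bool model_usable_by_flag(const struct model_sb *sb)
{
	assert(sb != NULL);
	return !(sb->flags & MODEL_SB_DYING);
}

/* Assumption: teardown marks the superblock dying before ->kill_sb() ends. */
static inline void model_mark_dying(struct model_sb *sb)
{
	assert(sb != NULL);
	sb->flags |= MODEL_SB_DYING;
}

/*
 * Models the hunk added to deactivate_locked_super(): unhash from
 * @fs_supers first, then announce SB_DEAD.
 */
static inline void model_kill(struct model_sb *sb)
{
	assert(sb != NULL);
	sb->on_fs_supers = false;
	sb->flags |= MODEL_SB_DEAD;
}
```

In this model, between model_mark_dying() and model_kill() the superblock
is still hashed, so the list-based test still reports it usable while the
flag-based test already rejects it. That is the window in which a
trylock_super()-style caller would want the SB_DYING check.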