Re: [PATCH v4 RESEND 2/2] buffer: record blockdev write errors in super_block that it backs

Jan Kara <jack@xxxxxxx> · Wed, 15 Apr 2020 11:17:46 +0200

On Tue 14-04-20 14:37:21, Jeff Layton wrote:
> On Tue, 2020-04-14 at 18:26 +0200, Jan Kara wrote:
> > On Tue 14-04-20 08:04:09, Jeff Layton wrote:
> > > From: Jeff Layton <jlayton@xxxxxxxxxx>
> > > 
> > > When syncing out a block device (a'la __sync_blockdev), any error
> > > encountered will only be recorded in the bd_inode's mapping. When the
> > > blockdev contains a filesystem however, we'd like to also record the
> > > error in the super_block that's stored there.
> > > 
> > > Make mark_buffer_write_io_error also record the error in the
> > > corresponding super_block when a writeback error occurs and the block
> > > device contains a mounted superblock.
> > > 
> > > Signed-off-by: Jeff Layton <jlayton@xxxxxxxxxx>
> > 
> > The patch looks good to me. I'd just note that bh->b_bdev->bd_super
> > dereference is safe only because we will flush all dirty data when
> > unmounting a filesystem which is somewhat tricky. Maybe that warrants a
> > comment? Otherwise feel free to add:
> > 
> > Reviewed-by: Jan Kara <jack@xxxxxxx>
> 
> Oh, hmm...now that I look again, I'm not sure this is actually safe.
> 
> bh->b_bdev gets cleared out as we discard the buffer, so I don't think
> that could end up getting zeroed while we're still using it.

Correct.

> The bd_super pointer gets zeroed out in kill_block_super, and after that
> point it calls sync_blockdev(). Could writeback error processing race
> with kill_block_super such that bd_super gets set to NULL after we test
> it but before we dereference it?

Yeah, you're right. But you can avoid the race with
READ_ONCE(bh->b_bdev->bd_super) and a big fat comment explaining why it is
safe... :)
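
Roughly what I mean, as a userspace sketch rather than the actual patch:
the structs below are cut-down stand-ins, s_wb_err is an assumed field name,
record_sb_error() is a made-up helper, and a relaxed C11 atomic load stands
in for the kernel's READ_ONCE(). The point is only that the pointer is
loaded once into a local, and both the NULL test and the dereference use
that local:

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; only the fields used here. */
struct super_block {
	int s_wb_err;		/* assumed name for the recorded error */
};

struct block_device {
	_Atomic(struct super_block *) bd_super;
};

/* Userspace approximation of READ_ONCE(): exactly one load of the pointer. */
static struct super_block *read_once_super(struct block_device *bdev)
{
	return atomic_load_explicit(&bdev->bd_super, memory_order_relaxed);
}

/* Returns 1 if the error was recorded in a superblock, 0 if none is attached. */
static int record_sb_error(struct block_device *bdev, int err)
{
	/* Single snapshot: a concurrent store of NULL cannot change 'sb'. */
	struct super_block *sb = read_once_super(bdev);

	if (!sb)
		return 0;

	/* Use the snapshot; never re-read bdev->bd_super here. */
	sb->s_wb_err = err;
	return 1;
}
```

This only closes the window between the test and the dereference. It says
nothing about whether the superblock is still alive at that point, which is
what the comment (or the RCU protection) has to cover.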

Or you could be less daring and put rcu protection there because
superblocks are RCU freed...

								Honza
-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR