Re: [PATCH 6/8] bdi: add a new writeback list for sync

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed 17-06-15 10:55:58, Josef Bacik wrote:
> On 06/17/2015 03:34 AM, Jan Kara wrote:
> >On Tue 16-06-15 08:42:27, Josef Bacik wrote:
> >>>>  /*
> >>>>- * Wait for writeback on an inode to complete. Caller must have inode pinned.
> >>>>+ * Wait for writeback on an inode to complete during eviction. Caller must have
> >>>>+ * inode pinned.
> >>>>   */
> >>>>  void inode_wait_for_writeback(struct inode *inode)
> >>>>  {
> >>>>+	BUG_ON(!(inode->i_state & I_FREEING));
> >>>>+
> >>>>  	spin_lock(&inode->i_lock);
> >>>>  	__inode_wait_for_writeback(inode);
> >>>>  	spin_unlock(&inode->i_lock);
> >>>>+
> >>>>+	/*
> >>>>+	 * bd_inode's will have already had the bdev free'd so don't bother
> >>>>+	 * doing the bdi_clear_inode_writeback.
> >>>>+	 */
> >>>>+	if (!sb_is_blkdev_sb(inode->i_sb))
> >>>>+		bdi_clear_inode_writeback(inode_to_bdi(inode), inode);
> >>>>  }
> >>>
> >>>Why do we bother with not putting bdev inode back?
> >>>
> >>
> >>My memory is rusty on this, but if the inode is the inode for a bdev
> >>we will have already free'd the bdev at this point and we get a null
> >>pointer deref, so this just skips that bit.
> >
> >Ah, the reason likely is that bdev->bd_disk is NULL (already cleaned up in
> >__blkdev_put()) at this moment and thus bdev_get_queue() called from
> >inode_to_bdi() will oops. Can you please add these details to the comment?
> >It's a bit subtle...
> >
> >Also we shouldn't have any pages in the block device mapping anymore
> >because of the work done in __blkdev_put() (and thus inode shouldn't be in
> >the writeback list) but I'd be calmer if we asserted
> >list_empty(&inode->i_wb_list). Can you please add that? Thanks!
> 
> Won't it trip if we never sync before we drop the device tho?  So we
> write some stuff to the block device, it gets written out, we then
> drop the device for whatever reason and boom, hit
> BUG_ON(&inode->i_wb_list) because we're still on the writeback list
> even though it doesn't matter because this disk is going away.  Just
> an untested theory, what do you think?

Well, we are going to free the inode. If it is still linked into
the writeback list, we are in trouble as the list will now contain freed
element. So better BUG_ON earlier than chase memory corruption later.

Calling bdi_clear_inode_writeback() in kill_bdev() is probably a good idea
and we can then just have the assertion in evict() to make sure nothing
went wrong. OK?

								Honza
-- 
Jan Kara <jack@xxxxxxx>
SUSE Labs, CR
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]
  Powered by Linux