Re: [PATCH 6/8] bdi: add a new writeback list for sync

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 06/15/2015 07:12 AM, Jan Kara wrote:
On Thu 11-06-15 15:41:11, Josef Bacik wrote:
From: Dave Chinner <dchinner@xxxxxxxxxx>

wait_sb_inodes() current does a walk of all inodes in the filesystem
to find dirty one to wait on during sync. This is highly
inefficient and wastes a lot of CPU when there are lots of clean
cached inodes that we don't need to wait on.

To avoid this "all inode" walk, we need to track inodes that are
currently under writeback that we need to wait for. We do this by
adding inodes to a writeback list on the bdi when the mapping is
first tagged as having pages under writeback.  wait_sb_inodes() can
then walk this list of "inodes under IO" and wait specifically just
for the inodes that the current sync(2) needs to wait for.

To avoid needing all the realted locking to be safe against
interrupts, Jan Kara suggested that we be lazy about removal from
the writeback list. That is, we don't remove inodes from the
writeback list on IO completion, but do it directly during a
wait_sb_inodes() walk.

This means that the a rare sync(2) call will have some work to do
skipping clean inodes However, in the current problem case of
concurrent sync workloads, concurrent wait_sb_inodes() calls only
walk the very recently dispatched inodes and hence should have very
little work to do.

This also means that we have to remove the inodes from the writeback
list during eviction. Do this in inode_wait_for_writeback() once
all writeback on the inode is complete.

Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
Signed-off-by: Josef Bacik <jbacik@xxxxxx>

I don't think you noticed comments I did when you last posted this
series:

You're right I missed them completely, sorry.


  /*
- * Wait for writeback on an inode to complete. Caller must have inode pinned.
+ * Wait for writeback on an inode to complete during eviction. Caller must have
+ * inode pinned.
   */
  void inode_wait_for_writeback(struct inode *inode)
  {
+	BUG_ON(!(inode->i_state & I_FREEING));
+
  	spin_lock(&inode->i_lock);
  	__inode_wait_for_writeback(inode);
  	spin_unlock(&inode->i_lock);
+
+	/*
+	 * bd_inode's will have already had the bdev free'd so don't bother
+	 * doing the bdi_clear_inode_writeback.
+	 */
+	if (!sb_is_blkdev_sb(inode->i_sb))
+		bdi_clear_inode_writeback(inode_to_bdi(inode), inode);
  }

Why do we bother with not putting bdev inode back?


My memory is rusty on this, but if the inode is the inode for a bdev we will have already free'd the bdev at this point and we get a null pointer deref, so this just skips that bit. I'll fix up the indenting. Thanks,

Josef
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]
  Powered by Linux