Re: [PATCH] direct-io: only inc/dec inode->i_dio_count for file systems

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, 15 Apr 2015 13:26:51 -0600 Jens Axboe <axboe@xxxxxx> wrote:

> > Is there similar impact to direct-io-to-file?  It would be nice to fix
> > that up also.  Many filesystems do something along the lines of
> >
> > 	atomic_inc(i_dio_count);
> > 	wibble()
> > 	atomic_dev(i_dio_count);
> > 	__blockdev_direct_IO(...);
> >
> > and with your patch I think we could change them to
> >
> > 	atomic_inc(i_dio_count);
> > 	wibble()
> > 	__blockdev_direct_IO(..., flags|DIO_IGNORE_TRUNCATE);
> > 	atomic_dev(i_dio_count);
> >
> > which would halve the atomic op load.
> 
> I haven't checked pure file, but without extending, I suspect that we 
> should see similar benefits there. In any case, it'd make sense doing, 
> having twice the atomic inc/dec is just a bad idea in general, if we can 
> get rid of it.
> 
> A quick grep doesn't show this use case, or I'm just blind. Where do you 
> see that?

btrfs_direct_IO() holds i_dio_count across its call to
__blockdev_direct_IO() for writes.  That makes the dio_bio_count
manipulation in do_blockdev_direct_IO() unneeded?  ext4 is similar.

Reducing from 4 ops to 2 probably won't make as much difference as
reducing from 2 to 0 - most of the cost comes from initially grabbing that
cacheline from a different CPU.

> > But that's piling hack on top of hack.  Can we change the
> 
> I'd more view it as reducing the hack, the real hack is the way that we 
> manually do atomic_inc() on i_dio_count, and then call a magic 
> inode_dio_done() that decrements it again. It's not very pretty, I'm 
> just reducing the scope of the hack :-)

A magic flag which says "you don't need to do this in that case because
I know this inode is special".  direct-io already has too much of this :(

> > do_blockdev_direct_IO() interface to "caller shall hold i_mutex, or
> > increment i_dio_count"?  ie: exclusion against truncate is wholly the
> > caller's responsibility.  That way, this awkward sharing of
> > responsibility between caller and callee gets cleaned up and
> > DIO_IGNORE_TRUNCATE goes away.
> 
> That would clean it up, at the expense of touching more churn. I'd be 
> fine with doing it that way.

OK, could be done later I suppose.

> > inode_dio_begin() would be a good place to assert that i_mutex is held,
> > btw.
> 
> How would you check it? If the i_dio_count is bumped, then you'd not 
> need to hold i_mutex.

	if (atomic_add_return() == 1)
		assert()

I guess.  It was just a thought.  Having wandered around the code, I'm
not 100% confident that everyone is holding i_mutex - it's not all
obviously correct.

otoh, the caller doesn't *have* to choose i_mutex for the external
exclusion, and perhaps some callers have used something else.

--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]
  Powered by Linux