Re: [RFC PATCH 3/3] DIO: don't fall back to buffered writes

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Nov 01, 2006 at 07:47:19AM -0500, Chris Mason wrote:
> On Wed, Nov 01, 2006 at 04:21:37PM +0530, Suparna Bhattacharya wrote:
> > > +	/*
> > > +	 * extend the file if needed, and drop i_mutex for non-aio writes.
> > > +	 * We extend the file by creating a hole that is later filled in
> > > +	 * during the O_DIRECT.  Because pages are locked or placeholders
> > > +	 * are inserted, the locking rules end up being the same as
> > > +	 * mmap'd writes using writepage to fill holes
> > 
> > I like the idea of extending the size in the beginning to avoid having
> > to hold i_sem for the extended writes case. The only question is whether
> > there is a semantics issue here - e.g. if there isn't enough space, would the
> > app which did an append still see the size change even though the blocks
> > themselves are not populated ? Of course we already have this happen in the
> > -EIO case for AIO-DIO writes, no am not sure if this is an issue.
> 
> You're right, there is a semantic issue, I'm being a little loose with
> the traditional write() expectations.  The current code creates a hole
> and does not remove it in the face of errors.
> 
> > 
> > > +	 */
> > > +	dio->reacquire_i_mutex = 0;
> > > +	if (dio_lock_type == DIO_LOCKING && (rw & WRITE)) {
> > 
> > Hmm, we really do want to be able to avoid all these various locking
> > mode checks inside the DIO code to simplify things. I thought you
> > had mostly managed to do away with these in your previous patch. 
> > Unfortunately this change while it should help concurrency, brings
> > that check back in. One of the plans I had earlier was to pull this
> > up to a higher level - i.e. in the locking version of blockdev_direct_IO()
> > thus doing away with the flag and related confusion. Do you think
> > that would make sense ?
> 
> I went the other direction, making the flags a little more formal (patch
> will go out today).  I don't have strong feelings about this, but it
> seems easiest to me to keep the flags inside blockdev_direct_IO.

Do we have some kind of description of what conditions each flags would
be used / who would use them (not just what they do). For example I
have forgotton why DIO_CREATE was actually introduced for XFS. Is it
still needed now that we no longer needed to do the buffered IO fallback
for overwriting holes ? 

That would help the discussion.

And then there used to be special flags for cluster
filesystems which was introduced by GFS ... all of it got to be quite
confusing. The formality helps, but only to an extent.

> 
> > 
> > 
> > > +		/* if our write goes past i_size, do an expanding
> > > +		 * truncate to fill it before dropping i_mutex
> > > +		 */
> > > +		if (end > i_size_read(inode) && iocb->ki_filp) {
> > > +			struct iattr newattrs;
> > > +			newattrs.ia_size = end;
> > > +			newattrs.ia_file = iocb->ki_filp;
> > > +			newattrs.ia_valid = ATTR_SIZE | ATTR_FILE;
> > > +			retval = notify_change(iocb->ki_filp->f_dentry,
> > > +					       &newattrs);
> > > +			if (retval)
> > > +				goto out;
> > > +		}
> > > +		if (is_sync_kiocb(iocb)) {
> > > +			dio->reacquire_i_mutex = 1;
> > > +			mutex_unlock(&inode->i_mutex);
> > 
> > BTW, don't the page locks cover the AIO case as well ? I'm wondering
> > why we need this special case here.
> 
> Only because i_mutex is already dropped implicitly by the aio code when
> we return.  Ideally we could drop it sooner inside fs/direct-io.c, but
> I'll leave that tuning for a later patch.

OK, I see.

There is one other thing that I'm not sure we have covered yet. That is
ensuring ordered mode guarantees with AIO-DIO. As long as we were falling
back to buffered IO in cases involving block allocation or forcing extends
to be synchronous, we probably didn't need to worry about this. How do
we make sure we address that now.

Regards
Suparna

> 
> -chris

-- 
Suparna Bhattacharya (suparna@xxxxxxxxxx)
Linux Technology Center
IBM Software Lab, India

-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]
  Powered by Linux