Re: [PATCH 1/5] fallocate() implementation in i86, x86_64 and powerpc

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, May 10, 2007 at 05:26:20PM +0530, Amit K. Arora wrote:
> On Thu, May 10, 2007 at 10:59:26AM +1000, David Chinner wrote:
> > On Wed, May 09, 2007 at 09:31:02PM +0530, Amit K. Arora wrote:
> > > I have the updated patches ready which take care of Andrew's comments.
> > > Will run some tests and post them soon.
> > > 
> > > But, before submitting these patches, I think it will be better to
> > > finalize on certain things which might be worth some discussion here:
> > > 
> > > 1) Should the file size change when preallocation is done beyond EOF ?
> > > - Andreas and Chris Wedgwood are in favor of not changing the file size
> > > in this case. I also tend to agree with them. Does anyone has an
> > > argument in favor of changing the filesize ?  If not, I will remove the
> > > code which changes the filesize, before I resubmit the concerned ext4
> > > patch.
> > 
> > I think there needs to be both. If we don't have a mechanism to atomically
> > change the file size with the preallocation, then applications that use
> > stat() to work out if they need to preallocate more space will end up
> > racing.
> 
> By "both" above, do you mean we should give user the flexibility if it wants
> the filesize changed or not ? It can be done by having *two* modes for
> preallocation in the system call - say FA_PREALLOCATE and FA_ALLOCATE. If we
> use FA_PREALLOCATE mode, fallocate() will allocate blocks, but will not
> change the filesize and [cm]time. If FA_ALLOCATE mode is used, fallocate()
> will change the filesize if required (i.e.  when allocation is beyond EOF)
> and also update [cm]time.  This way, the application can decide what it
> wants.

Yes, that's right.

> This will be helpfull for the partial allocation scenario also. Think of the
> case when we do not change the filesize in fallocate() and expect
> applications/posix_fallocate() to do ftruncate() after fallocate() for this.
> Now if fallocate() results in a partial allocation with -ENOSPC error
> returned, applications/posix_fallocate() will not know for what length
> ftruncate() has to be called.  :(

Well, posix_fallocate() either gets all the space or it fails. If
you truncate to extend the file size after an ENOSPC, then that is
a buggy implementation.

The same could be said for any application, or even the fallocate()
call itself if it changes the filesize without having completely
preallocated the space asked....

> Hence it may be a good idea to give user the flexibility if it wants to
> atomically change the file size with preallocation or not. But, with more
> flexibility there comes inconsistency in behavior, which is worth
> considering.

We've got different modes to specify different behaviour. That's
what the mode field was put there for in the first place - the
interface is *designed* to support different preallocation
behaviours....

> > > 2) For FA_UNALLOCATE mode, should the file system allow unallocation of
> > > normal (non-preallocated) blocks (blocks allocated via regular
> > > write/truncate operations) also (i.e. work as punch()) ?
> > 
> > Yes. That is the current XFS implementation for XFS_IOC_UNRESVSP, and what
> > i did for FA_UNALLOCATE as well.
> 
> Ok. But, some people may not expect/like this. I think, we can keep it on
> the backburner for a while, till other issues are sorted out.

How can it be a "backburner" issue when it defines the
implementation?  I've already implemented some thing in XFS that
sort of does what I think that the interface is supposed to do, but
I need that interface to be nailed down before proceeding any
further.

All I'm really interested in right now is that the fallocate
_interface_ can be used as a *complete replacement* for the
pre-existing XFS-specific ioctls that are already used by
applications.  What ext4 can or can't do right now is irrelevant to
this discussion - the interface definition needs to take priority
over implementation....

Cheers,

Dave,
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]
  Powered by Linux