Re: [PATCH 3/5] vfs: add a zero-initialization mode to fallocate

On Wed, Sep 22, 2021 at 03:49:31PM +1000, Dave Chinner wrote:
> On Tue, Sep 21, 2021 at 09:13:54PM -0700, Darrick J. Wong wrote:
> > On Wed, Sep 22, 2021 at 01:59:07PM +1000, Dave Chinner wrote:
> > > On Tue, Sep 21, 2021 at 07:38:01PM -0700, Darrick J. Wong wrote:
> > > > On Tue, Sep 21, 2021 at 07:16:26PM -0700, Dan Williams wrote:
> > > > > On Tue, Sep 21, 2021 at 1:32 AM Christoph Hellwig <hch@xxxxxxxxxxxxx> wrote:
> > > > > >
> > > > > > On Tue, Sep 21, 2021 at 10:44:31AM +1000, Dave Chinner wrote:
> > > > > > > I think this wants to be a behavioural modifier for existing
> > > > > > > operations rather than an operation unto itself. i.e. similar to how
> > > > > > > KEEP_SIZE modifies ALLOC behaviour but doesn't fundamentally alter
> > > > > > > the guarantees ALLOC provides userspace.
> > > > > > >
> > > > > > > In this case, the change of behaviour over ZERO_RANGE is that we
> > > > > > > want physical zeros to be written instead of the filesystem
> > > > > > > optimising away the physical zeros by manipulating the layout
> > > > > > > of the file.
> > > > > >
> > > > > > Yes.
> > > > > >
> > > > > > > Then we have an API that looks like:
> > > > > > >
> > > > > > >       ALLOC           - allocate space efficiently
> > > > > > >       ALLOC | INIT    - allocate space by writing zeros to it
> > > > > > >       ZERO            - zero data and preallocate space efficiently
> > > > > > >       ZERO | INIT     - zero range by writing zeros to it
> > > > > > >
> > > > > > > Which seems to cater for all the cases I know of where physically
> > > > > > > writing zeros instead of allocating unwritten extents is the
> > > > > > > preferred behaviour of fallocate()....
> > > > > >
> > > > > > Agreed.  I'm not sure INIT is really the right name, but I can't come
> > > > > > up with a better idea offhand.
> > > > > 
> > > > > FUA? As in, this is a forced-unit-access zeroing all the way to media
> > > > > bypassing any mechanisms to emulate zero-filled payloads on future
> > > > > reads.
> > > 
> > > Yes, that's the semantic we want, but FUA already defines specific
> > > data integrity behaviour in the storage stack w.r.t. volatile
> > > caches.
> > > 
> > > Also, FUA is associated with devices - it's low level storage jargon
> > > and so is not really appropriate to call a user interface operation
> > > FUA where users have no idea what a "unit" or "access" actually
> > > means.
> > > 
> > > Hence we should not overload this name with some other operation
> > > that does not have (and should not have) explicit data integrity
> > > requirements. That will just cause confusion for everyone.
> > > 
> > > > FALLOC_FL_ZERO_EXISTING, because you want to zero the storage that
> > > > already exists at that file range?
> > > 
> > > IMO that doesn't work as a behavioural modifier for ALLOC because
> > > the ALLOC semantics are explicitly "don't touch existing user
> > > data"...
> > 
> > Well since you can't preallocate /and/ zerorange at the same time...
> > 
> > /* For FALLOC_FL_ZERO_RANGE, write zeroes to pre-existing mapped storage. */
> > #define FALLOC_FL_ZERO_EXISTING		(0x80)
> 
> Except we also want the newly allocated regions (i.e. where holes
> were) in that range being zeroed to have zeroes written to them as
> well, yes? Otherwise we end up with a combination of unwritten
> extents and physical zeroes, and you can't use
> ZERORANGE|EXISTING as a replacement for PUNCH + ALLOC|INIT.
> 
> /*
>  * For preallocation and zeroing operations, force the filesystem to
>  * write zeroes rather than use unwritten extents to indicate the
>  * range contains zeroes.
>  *
>  * For filesystems that support unwritten extents, this trades off
>  * slow fallocate performance for faster first write performance as
>  * unwritten extent conversion on the first write to each block in
>  * the range is not needed.
>  *
>  * Care is required when using FALLOC_FL_ALLOC_INIT_DATA as it will
>  * be much slower overall for large ranges and/or slow storage
>  * compared to using unwritten extents.
>  */
> #define FALLOC_FL_ALLOC_INIT_DATA	(1 << 7)

I prefer FALLOC_FL_ZEROINIT_DATA here, because in the ZERO|INIT case
we're not allocating any new space, merely rewriting existing storage.
I also want to expand the description slightly:

/*
 * For preallocation, force the filesystem to write zeroes rather than
 * use unwritten extents to indicate the range contains zeroes.  For
 * zeroing operations, force the filesystem to write zeroes to existing
 * written extents.
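
FWIW, for anyone following along: without a flag like this, userspace
that wants the ZERO|INIT behaviour has to write the zeroes by hand.
A rough sketch of that fallback, just to illustrate the semantics.
write_zeroes_range() is a made-up helper name and the 64k chunk size
is arbitrary; neither comes from this patch series.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of any patch): overwrite
 * [offset, offset + len) with real zeroes using plain pwrite() calls,
 * so the filesystem cannot satisfy the request by converting the
 * range to unwritten extents or punching it out.
 *
 * Returns 0 on success, -1 with errno set on failure.
 */
static int write_zeroes_range(int fd, off_t offset, off_t len)
{
	/* Static storage is zero-initialized, so this is a block of zeroes. */
	static const char zeroes[64 * 1024];

	while (len > 0) {
		size_t chunk = sizeof(zeroes);
		ssize_t n;

		if (len < (off_t)chunk)
			chunk = (size_t)len;

		n = pwrite(fd, zeroes, chunk, offset);
		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, retry this chunk */
			return -1;
		}
		if (n == 0) {
			/* No progress on a non-empty write; bail out. */
			errno = EIO;
			return -1;
		}

		/* Short writes are fine, just advance by what was written. */
		offset += n;
		len -= n;
	}

	return 0;
}
```

Note that this deliberately does not fsync, in line with the point
above that the flag should not carry explicit data integrity
requirements. It also only guarantees that the filesystem sees a
write of zeroes; what layers below the filesystem do with them is a
separate question.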

--D

> 
> Cheers,
> 
> Dave.
> 
> -- 
> Dave Chinner
> david@xxxxxxxxxxxxx


