Re: [PATCH, RFC] libxfs: use FALLOC_FL_ZERO_RANGE in libxfs_device_zero

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Feb 13, 2020 at 03:12:24PM -0600, Eric Sandeen wrote:
> I had a request from someone who cared about mkfs speed(!)
> over a slower network block device to look into using faster
> zeroing methods, particularly for the log, during mkfs.xfs.
> 
> e2fsprogs already does this, thanks to some guy named Darrick:
> 
> /*
>  * If we know about ZERO_RANGE, try that before we try PUNCH_HOLE because
>  * ZERO_RANGE doesn't unmap preallocated blocks.  We prefer fallocate because
>  * it always invalidates page cache, and libext2fs requires that reads after
>  * ZERO_RANGE return zeroes.
>  */
> static int __unix_zeroout(int fd, off_t offset, off_t len)
> {
>         int ret = -1;
> 
> #if defined(HAVE_FALLOCATE) && defined(FALLOC_FL_ZERO_RANGE)
>         ret = fallocate(fd, FALLOC_FL_ZERO_RANGE, offset, len);
>         if (ret == 0)
>                 return 0;
> #endif
> #if defined(HAVE_FALLOCATE) && defined(FALLOC_FL_PUNCH_HOLE) && defined(FALLOC_FL_KEEP_SIZE)
>         ret = fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
>                         offset,  len);
>         if (ret == 0)
>                 return 0;
> #endif
>         errno = EOPNOTSUPP;
>         return ret;
> }
> 
> and nobody has exploded so far, AFAIK.  :)  So, floating this idea
> for xfsprogs.  I'm a little scared of the second #ifdef block above, but
> if that's really ok/consistent/safe we could add it too.

If FALLOC_FL_PUNCH_HOLE is defined, then FALLOC_FL_KEEP_SIZE is
guaranteed to be defined, so that condition check is somewhat
redundant. See commit 79124f18b335 ("fs: add hole punching to
fallocate")....

> The patch moves some defines around too, I could split that up and resend
> if this isn't laughed out of the room.
> 
> Thanks,
> -Eric
> 
> =====
> 
> libxfs: use FALLOC_FL_ZERO_RANGE in libxfs_device_zero
> 
> I had a request from someone who cared about mkfs speed(!)
> over a slower network block device to look into using faster
> zeroing methods, particularly for the log, during mkfs.
> 
> Using FALLOC_FL_ZERO_RANGE is faster in this case than writing
> a bunch of zeros across a wire.
> 
> Signed-off-by: Eric Sandeen <sandeen@xxxxxxxxxx>
> ---
> 
> diff --git a/include/linux.h b/include/linux.h
> index 8f3c32b0..425badb5 100644
> --- a/include/linux.h
> +++ b/include/linux.h
> @@ -113,6 +113,26 @@ static __inline__ void platform_uuid_copy(uuid_t *dst, uuid_t *src)
>  	uuid_copy(*dst, *src);
>  }
>  
> +#ifndef FALLOC_FL_PUNCH_HOLE
> +#define FALLOC_FL_PUNCH_HOLE	0x02
> +#endif
> +
> +#ifndef FALLOC_FL_COLLAPSE_RANGE
> +#define FALLOC_FL_COLLAPSE_RANGE 0x08
> +#endif
> +
> +#ifndef FALLOC_FL_ZERO_RANGE
> +#define FALLOC_FL_ZERO_RANGE 0x10
> +#endif
> +
> +#ifndef FALLOC_FL_INSERT_RANGE
> +#define FALLOC_FL_INSERT_RANGE 0x20
> +#endif
> +
> +#ifndef FALLOC_FL_UNSHARE_RANGE
> +#define FALLOC_FL_UNSHARE_RANGE 0x40
> +#endif

These were added to allow xfs_io to test these operations before
there was userspace support for them. I do not think we should
propagate them outside of xfs_io - if they are not supported by the
distro at build time, we shouldn't attempt to use them in mkfs.

i.e. xfs_io is test-enablement code for developers, mkfs.xfs is
production code for users, so different rules kinda exist for
them...

> diff --git a/libxfs/Makefile b/libxfs/Makefile
> index fbcc963a..b4e8864b 100644
> --- a/libxfs/Makefile
> +++ b/libxfs/Makefile
> @@ -105,6 +105,10 @@ CFILES = cache.c \
>  #
>  #LCFLAGS +=
>  
> +ifeq ($(HAVE_FALLOCATE),yes)
> +LCFLAGS += -DHAVE_FALLOCATE
> +endif

HAVE_FALLOCATE comes from an autoconf test. I suspect that this
needs to be made more finegrained, testing for fallocate() features,
not just just whether the syscall exists. And the above should
probably be in include/builddefs.in so that it's available to all
of xfsprogs, not just libxfs code...

> +
>  FCFLAGS = -I.
>  
>  LTLIBS = $(LIBPTHREAD) $(LIBRT)
> diff --git a/libxfs/rdwr.c b/libxfs/rdwr.c
> index 0d9d7202..94f63bbf 100644
> --- a/libxfs/rdwr.c
> +++ b/libxfs/rdwr.c
> @@ -4,6 +4,9 @@
>   * All Rights Reserved.
>   */
>  
> +#if defined(HAVE_FALLOCATE)
> +#include <linux/falloc.h>
> +#endif

Urk. That should be in include/linux.h, right?

>  
>  #include "libxfs_priv.h"
>  #include "init.h"
> @@ -60,9 +63,21 @@ int
>  libxfs_device_zero(struct xfs_buftarg *btp, xfs_daddr_t start, uint len)
>  {
>  	xfs_off_t	start_offset, end_offset, offset;
> -	ssize_t		zsize, bytes;
> +	ssize_t		zsize, bytes, len_bytes;
>  	char		*z;
> -	int		fd;
> +	int		ret, fd;
> +
> +	fd = libxfs_device_to_fd(btp->dev);
> +	start_offset = LIBXFS_BBTOOFF64(start);
> +	end_offset = LIBXFS_BBTOOFF64(start + len) - start_offset;
> +
> +#if defined(HAVE_FALLOCATE)
> +	/* try to use special zeroing methods, fall back to writes if needed */
> +	len_bytes = LIBXFS_BBTOOFF64(len);
> +	ret = fallocate(fd, FALLOC_FL_ZERO_RANGE, start_offset, len_bytes);
> +	if (ret == 0)
> +		return 0;
> +#endif

I kinda dislike the "return if success" hidden inside the ifdef -
it's not a code pattern I'd expect to see. This is what I'd tend
to expect in include/linux.h:

#if defined(HAVE_FALLOCATE_ZERO_RANGE)
static inline int
platform_zero_range(
	int		fd,
	xfs_off_t	start_offset,
	xfs_off_t	end_offset)
{
	int		ret;

	ret = fallocate(fd, FALLOC_FL_ZERO_RANGE, start_offset, len_bytes);
	if (!ret)
		return 0;
	return -errno;
}
#else
#define platform_zero_range(fd, s, o)	(true)
#endif

and then the code in libxfs_device_zero() does:

	error = platform_zero_range(fd, start_offset, len_bytes);
	if (!error)
		return 0;

without adding nasty #defines...

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx



[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux