Re: [PATCH, RFC] libxfs: use FALLOC_FL_ZERO_RANGE in libxfs_device_zero

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2/13/20 5:48 PM, Dave Chinner wrote:
> On Thu, Feb 13, 2020 at 03:12:24PM -0600, Eric Sandeen wrote:
>> I had a request from someone who cared about mkfs speed(!)
>> over a slower network block device to look into using faster
>> zeroing methods, particularly for the log, during mkfs.xfs.
>>
>> e2fsprogs already does this, thanks to some guy named Darrick:
>>
>> /*
>>  * If we know about ZERO_RANGE, try that before we try PUNCH_HOLE because
>>  * ZERO_RANGE doesn't unmap preallocated blocks.  We prefer fallocate because
>>  * it always invalidates page cache, and libext2fs requires that reads after
>>  * ZERO_RANGE return zeroes.
>>  */
>> static int __unix_zeroout(int fd, off_t offset, off_t len)
>> {
>>         int ret = -1;
>>
>> #if defined(HAVE_FALLOCATE) && defined(FALLOC_FL_ZERO_RANGE)
>>         ret = fallocate(fd, FALLOC_FL_ZERO_RANGE, offset, len);
>>         if (ret == 0)
>>                 return 0;
>> #endif
>> #if defined(HAVE_FALLOCATE) && defined(FALLOC_FL_PUNCH_HOLE) && defined(FALLOC_FL_KEEP_SIZE)
>>         ret = fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
>>                         offset,  len);
>>         if (ret == 0)
>>                 return 0;
>> #endif
>>         errno = EOPNOTSUPP;
>>         return ret;
>> }
>>
>> and nobody has exploded so far, AFAIK.  :)  So, floating this idea
>> for xfsprogs.  I'm a little scared of the second #ifdef block above, but
>> if that's really ok/consistent/safe we could add it too.
> 
> If FALLOC_FL_PUNCH_HOLE is defined, then FALLOC_FL_KEEP_SIZE is
> guaranteed to be defined, so that condition check is somewhat
> redundant. See commit 79124f18b335 ("fs: add hole punching to
> fallocate")....

Style aside, is that 2nd zeroing method something we want to try?

>> The patch moves some defines around too, I could split that up and resend
>> if this isn't laughed out of the room.
>>
>> Thanks,
>> -Eric
>>
>> =====
>>
>> libxfs: use FALLOC_FL_ZERO_RANGE in libxfs_device_zero
>>
>> I had a request from someone who cared about mkfs speed(!)
>> over a slower network block device to look into using faster
>> zeroing methods, particularly for the log, during mkfs.
>>
>> Using FALLOC_FL_ZERO_RANGE is faster in this case than writing
>> a bunch of zeros across a wire.
>>
>> Signed-off-by: Eric Sandeen <sandeen@xxxxxxxxxx>
>> ---
>>
>> diff --git a/include/linux.h b/include/linux.h
>> index 8f3c32b0..425badb5 100644
>> --- a/include/linux.h
>> +++ b/include/linux.h
>> @@ -113,6 +113,26 @@ static __inline__ void platform_uuid_copy(uuid_t *dst, uuid_t *src)
>>  	uuid_copy(*dst, *src);
>>  }
>>  
>> +#ifndef FALLOC_FL_PUNCH_HOLE
>> +#define FALLOC_FL_PUNCH_HOLE	0x02
>> +#endif
>> +
>> +#ifndef FALLOC_FL_COLLAPSE_RANGE
>> +#define FALLOC_FL_COLLAPSE_RANGE 0x08
>> +#endif
>> +
>> +#ifndef FALLOC_FL_ZERO_RANGE
>> +#define FALLOC_FL_ZERO_RANGE 0x10
>> +#endif
>> +
>> +#ifndef FALLOC_FL_INSERT_RANGE
>> +#define FALLOC_FL_INSERT_RANGE 0x20
>> +#endif
>> +
>> +#ifndef FALLOC_FL_UNSHARE_RANGE
>> +#define FALLOC_FL_UNSHARE_RANGE 0x40
>> +#endif
> 
> These were added to allow xfs_io to test these operations before
> there was userspace support for them. I do not think we should
> propagate them outside of xfs_io - if they are not supported by the
> distro at build time, we shouldn't attempt to use them in mkfs.
> 
> i.e. xfs_io is test-enablement code for developers, mkfs.xfs is
> production code for users, so different rules kinda exist for
> them...

Fair enough.  people /could/ use newer xfsprogs on older kernels, but...
those defines are getting pretty old by now in any case.

It's probably not terrible to just fail the build on a system that doesn't
have falloc.h, by now?

>> diff --git a/libxfs/Makefile b/libxfs/Makefile
>> index fbcc963a..b4e8864b 100644
>> --- a/libxfs/Makefile
>> +++ b/libxfs/Makefile
>> @@ -105,6 +105,10 @@ CFILES = cache.c \
>>  #
>>  #LCFLAGS +=
>>  
>> +ifeq ($(HAVE_FALLOCATE),yes)
>> +LCFLAGS += -DHAVE_FALLOCATE
>> +endif
> 
> HAVE_FALLOCATE comes from an autoconf test. I suspect that this
> needs to be made more finegrained, testing for fallocate() features,
> not just just whether the syscall exists. And the above should
> probably be in include/builddefs.in so that it's available to all
> of xfsprogs, not just libxfs code...

But that's testing the kernel on the build host, right?  What's the
point of that?  Or am I misunderstanding?

>> +
>>  FCFLAGS = -I.
>>  
>>  LTLIBS = $(LIBPTHREAD) $(LIBRT)
>> diff --git a/libxfs/rdwr.c b/libxfs/rdwr.c
>> index 0d9d7202..94f63bbf 100644
>> --- a/libxfs/rdwr.c
>> +++ b/libxfs/rdwr.c
>> @@ -4,6 +4,9 @@
>>   * All Rights Reserved.
>>   */
>>  
>> +#if defined(HAVE_FALLOCATE)
>> +#include <linux/falloc.h>
>> +#endif
> 
> Urk. That should be in include/linux.h, right?
> 
>>  
>>  #include "libxfs_priv.h"
>>  #include "init.h"
>> @@ -60,9 +63,21 @@ int
>>  libxfs_device_zero(struct xfs_buftarg *btp, xfs_daddr_t start, uint len)
>>  {
>>  	xfs_off_t	start_offset, end_offset, offset;
>> -	ssize_t		zsize, bytes;
>> +	ssize_t		zsize, bytes, len_bytes;
>>  	char		*z;
>> -	int		fd;
>> +	int		ret, fd;
>> +
>> +	fd = libxfs_device_to_fd(btp->dev);
>> +	start_offset = LIBXFS_BBTOOFF64(start);
>> +	end_offset = LIBXFS_BBTOOFF64(start + len) - start_offset;
>> +
>> +#if defined(HAVE_FALLOCATE)
>> +	/* try to use special zeroing methods, fall back to writes if needed */
>> +	len_bytes = LIBXFS_BBTOOFF64(len);
>> +	ret = fallocate(fd, FALLOC_FL_ZERO_RANGE, start_offset, len_bytes);
>> +	if (ret == 0)
>> +		return 0;
>> +#endif
> 
> I kinda dislike the "return if success" hidden inside the ifdef -
> it's not a code pattern I'd expect to see. This is what I'd tend
> to expect in include/linux.h:
> 
> #if defined(HAVE_FALLOCATE_ZERO_RANGE)
> static inline int
> platform_zero_range(
> 	int		fd,
> 	xfs_off_t	start_offset,
> 	xfs_off_t	end_offset)
> {
> 	int		ret;
> 
> 	ret = fallocate(fd, FALLOC_FL_ZERO_RANGE, start_offset, len_bytes);
> 	if (!ret)
> 		return 0;
> 	return -errno;
> }
> #else
> #define platform_zero_range(fd, s, o)	(true)
> #endif
> 
> and then the code in libxfs_device_zero() does:
> 
> 	error = platform_zero_range(fd, start_offset, len_bytes);
> 	if (!error)
> 		return 0;
> 
> without adding nasty #defines...

Yeah that's much better.  Tho I hate adding more platform_* stuff when really
we're down to only one platform but maybe that's a project for another day.

Thanks for the review,
-Eric



[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux