Re: hole punching performance

"Bradley C. Kuszmaul" <kuszmaul@xxxxxxxxx> · Wed, 2 Jan 2013 20:45:22 -0500

Thanks for the help.  I got results similar to yours.  However, the
hole punching is much faster if you create the file with fallocate
than if you actually write some data into it.
 fallocate and then hole-punch is about 1us per hole punch.
 write and then hole-punch is about 90us per hole punch.

But 90us is likely to be plenty fast, so it's looking good.  (I'll
try to track down why my other program was slow.)
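
A minimal sketch of how the two cases above could be timed, assuming
Linux and a filesystem that supports FALLOC_FL_PUNCH_HOLE.  The names
time_hole_punch() and per_call_us() are made up for illustration, the
4KB holes at an 8KB stride simply mirror the test program quoted below,
and the fsync() after writing is an assumption rather than something
the measurements above are known to have done.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <linux/falloc.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/* Average cost of one call in microseconds, given total elapsed nanoseconds. */
static double per_call_us(long long elapsed_ns, int calls)
{
        assert(calls > 0);
        return (double)elapsed_ns / 1000.0 / calls;
}

/*
 * Create `path` with `len` bytes, either preallocated with fallocate()
 * (prewrite == 0) or filled with real data and fsync()ed (prewrite != 0),
 * then punch `holes` 4KB holes at an 8KB stride.
 * Returns the average microseconds per punch, or -1.0 on any error.
 */
static double time_hole_punch(const char *path, off_t len, int holes, int prewrite)
{
        struct timespec t0, t1;
        char buf[65536];
        long long ns;
        int fd, i;

        if (holes <= 0 || (off_t)holes * 8192 > len)
                return -1.0;

        fd = open(path, O_CREAT | O_TRUNC | O_RDWR, 0600);
        if (fd < 0)
                return -1.0;

        if (prewrite) {
                /* Write real data so the extents are allocated and written. */
                memset(buf, 0xab, sizeof(buf));
                for (off_t off = 0; off < len; off += (off_t)sizeof(buf)) {
                        size_t n = sizeof(buf);

                        if (len - off < (off_t)n)
                                n = (size_t)(len - off);
                        if (pwrite(fd, buf, n, off) != (ssize_t)n)
                                goto fail;
                }
                if (fsync(fd) != 0)
                        goto fail;
        } else if (fallocate(fd, 0, 0, len) != 0) {
                /* Preallocation only: no data is written in this case. */
                goto fail;
        }

        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (i = 0; i < holes; i++) {
                if (fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                              (off_t)i * 8192, 4096) != 0)
                        goto fail;
        }
        clock_gettime(CLOCK_MONOTONIC, &t1);
        close(fd);

        ns = (t1.tv_sec - t0.tv_sec) * 1000000000LL + (t1.tv_nsec - t0.tv_nsec);
        return per_call_us(ns, holes);
fail:
        close(fd);
        return -1.0;
}
```

Calling time_hole_punch() once with prewrite == 0 and once with
prewrite != 0, on a path in the filesystem under test, gives the two
per-punch figures to compare.  The absolute numbers will depend on the
kernel, the filesystem and the storage.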

I'll also look into the multithreaded performance and report back...

-Bradley

On Wed, Jan 2, 2013 at 6:27 PM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
> On Wed, Jan 02, 2013 at 04:51:07PM -0500, Bradley C. Kuszmaul wrote:
>> If I use hole-punching, what will happen to the performance of my application?
>>
>> I have a multithreaded application that creates large files (many
>> gigabytes per file).  The application sometimes wants to punch holes
>> (say 1 megabyte in size).
>>
>> On Redhat 6, I've measured that punching holes requires about 2ms
>
> What version of RHEL 6.x? On x <= 1, hole punching is a synchronous
> transaction.  On x >= 2, it is an asynchronous transaction and so is
> much, much faster.
>
>> (this with a battery-backed up RAID controller), which is slower than
>> I was hoping for, but it's probably OK.  The throughput is only about
>> 2ms per hole-punch even if I have lots of threads punching holes in
>> lots of different files at the same time.
>
> That sounds like synchronous transaction behaviour.
>
> A current 3.8-rc1 kernel does a hole punch in well under 2ms. Here's
> 10,000 hole punches being done in ~300ms:
>
> $ cat t.c
> #define _GNU_SOURCE
> #include <stdio.h>       /* perror */
> #include <unistd.h>
> #include <fcntl.h>
> #include <sys/ioctl.h>   /* ioctl */
> #include <linux/falloc.h>
> #include <xfs/xfs.h>
>
> int main(int argc, char *argv[])
> {
>         int i, fd;
>         fd = open("/mnt/scratch/blah", O_CREAT|O_TRUNC|O_RDWR, 0777);
>         perror("open");
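>         /* preallocate 20GB up front; no data is written */
>         fallocate(fd, 0, 0, 20 * 1024 * 1024 * 1024LL);
>         for (i = 0; i < 10000; i++) {
>                 /* equivalent fallocate() form of the same 4k punch: */
>                 // fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE, i * 8192, 4096);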
>                 struct xfs_flock64      l = {0};
>
>                 l.l_whence = SEEK_SET;
>                 l.l_start = i * 8192;
>                 l.l_len = 4096;
>
>                 ioctl(fd, XFS_IOC_UNRESVSP, &l);
>         }
>         close(fd);
> }
> dave@test-4:~$ gcc -O2 t.c
> dave@test-4:~$ rm -f /mnt/scratch/blah
> dave@test-4:~$ time ./a.out
> open: Success
>
> real    0m0.336s
> user    0m0.000s
> sys     0m0.336s
> dave@test-4:~$
>
> So that means roughly 300ms/10000 = 30us per hole punch call.
> I get the same result with fallocate or XFS_IOC_UNRESVSP, and I get
> the same result on RHEL 6.2+.
>
>> The question I have:  What will happen to the performance of other
>> threads doing read() and write() operations?  Will hole-punching slow
>> down the other read() and write() operations running in other threads?
>
> That all depends. Hole punching is serialised the same way as
> truncation - all concurrent operations to the same file are locked
> out while the hole punch is performed. Operations to other files
> will be unaffected unless they are trying to allocate or free extents
> in the same allocation group, or you are running a kernel that does
> synchronous transactions and the other operations serialise on the
> synchronous transaction commits...
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@xxxxxxxxxxxxx

_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs