Thanks Dave, this is very helpful information. I have a much better sense of what the benchmark should measure (e.g., for regression testing).

-Bradley

On Thu, Jan 3, 2013 at 12:51 AM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
> On Wed, Jan 02, 2013 at 08:45:22PM -0500, Bradley C. Kuszmaul wrote:
>> Thanks for the help. I got results similar to yours. However, the
>> hole punching is much faster if you create the file with fallocate
>> than if you actually write some data into it.
>> fallocate and then hole-punch is about 1us per hole punch.
>> write and then hole-punch is about 90us per hole punch.
>
> No surprise - after a write the hole punch has a lot more to do.
> I modified the test program to not use O_TRUNC, then ran:
>
> $ /usr/sbin/xfs_io -f -c "truncate 0" -c "pwrite -b 1m 0 20g" /mnt/scratch/blah
> wrote 21474836480/21474836480 bytes at offset 0
> 20.000 GiB, 20480 ops; 0:00:30.00 (675.049 MiB/sec and 675.0491 ops/sec)
> $ sync
> $ time ./a.out
>
> real    0m1.664s
> user    0m0.000s
> sys     0m1.656s
> $
>
> Why? perf top indicates that pretty quickly:
>
>  12.80%  [kernel]  [k] free_hot_cold_page
>  10.62%  [kernel]  [k] block_invalidatepage
>  10.62%  [kernel]  [k] _raw_spin_unlock_irq
>   8.35%  [kernel]  [k] kmem_cache_free
>   6.07%  [kernel]  [k] _raw_spin_unlock_irqrestore
>   3.65%  [kernel]  [k] put_page
>   3.51%  [kernel]  [k] __wake_up_bit
>   3.27%  [kernel]  [k] find_get_pages
>   2.84%  [kernel]  [k] get_pageblock_flags_group
>   2.66%  [kernel]  [k] cancel_dirty_page
>   2.09%  [kernel]  [k] truncate_inode_pages_range
>
> The page cache has to have holes punched in it after the write. So,
> let's rule that out by discarding it separately, and see just what
> the extent manipulation overhead is:
>
> $ rm -f /mnt/scratch/blah
> $ /usr/sbin/xfs_io -f -c "truncate 0" -c "pwrite -b 1m 0 20g" /mnt/scratch/blah
> wrote 21474836480/21474836480 bytes at offset 0
> 20.000 GiB, 20480 ops; 0:00:27.00 (749.381 MiB/sec and 749.3807 ops/sec)
> $ sudo sh -c "echo 3 > /proc/sys/vm/drop_caches"
> $ time ./a.out
>
> real    0m0.347s
> user    0m0.000s
> sys     0m0.332s
> $
>
> Which is the same as the fallocate/punch method gives....
>
>> But 90us is likely to be plenty fast, so it's looking good. (I'll
>> try to track down why my other program was slow.)
>
> If you open the file with O_SYNC or O_DSYNC, then you'll still get
> synchronous behaviour....
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@xxxxxxxxxxxxx
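
[Editor's note] The "./a.out" hole-punch timer itself was not posted in this part of the thread. For readers wanting to reproduce the measurements, the following is a minimal sketch of what such a program might look like; the 64k hole size, the punch-every-other-region pattern, and the file path on the command line are assumptions, not the actual parameters used by Bradley or Dave.

/*
 * punch_test.c - minimal sketch of a hole-punch timing test, in the
 * spirit of the "./a.out" program discussed above.  The real test
 * program was not posted, so the hole size, the punch pattern and
 * the 20GiB file length below are assumptions.
 *
 * Build: gcc -O2 -o punch_test punch_test.c
 * Run:   ./punch_test /mnt/scratch/blah
 */
#define _GNU_SOURCE
#define _FILE_OFFSET_BITS 64
#include <fcntl.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

#define HOLE_SIZE	(64 * 1024)		/* assumed: 64k holes */
#define FILE_SIZE	(20ULL << 30)		/* 20GiB, as in the runs above */

int main(int argc, char **argv)
{
	struct timespec start, end;
	double secs;
	off_t off;
	long nholes = 0;
	int fd;

	if (argc != 2) {
		fprintf(stderr, "usage: %s <file>\n", argv[0]);
		return 1;
	}

	fd = open(argv[1], O_RDWR);
	if (fd < 0) {
		perror("open");
		return 1;
	}

	clock_gettime(CLOCK_MONOTONIC, &start);

	/* Punch a hole in every other HOLE_SIZE-sized region of the file. */
	for (off = 0; off + HOLE_SIZE <= (off_t)FILE_SIZE; off += 2 * HOLE_SIZE) {
		if (fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
			      off, HOLE_SIZE) < 0) {
			perror("fallocate(FALLOC_FL_PUNCH_HOLE)");
			return 1;
		}
		nholes++;
	}

	clock_gettime(CLOCK_MONOTONIC, &end);

	secs = (end.tv_sec - start.tv_sec) +
	       (end.tv_nsec - start.tv_nsec) / 1e9;
	printf("%ld hole punches in %.3f s (%.2f us per punch)\n",
	       nholes, secs, secs * 1e6 / nholes);

	close(fd);
	return 0;
}

To exercise the fallocate-then-punch path Bradley describes, preallocate the file instead of writing it (e.g. with something like "xfs_io -f -c 'falloc 0 20g' <file>") before running the timer; for the write path, dropping the page cache first (as in the drop_caches run above) isolates the extent-manipulation cost from the page-cache invalidation cost.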