Re: [RFC 0/9] ext4: An Auxiliary Tree for the Directory Index

On Mon, Jun 17, 2013 at 10:58:35AM +0200, Lukáš Czerner wrote:
>On Sun, 16 Jun 2013, Dave Chinner wrote:
>
>> Date: Sun, 16 Jun 2013 10:55:33 +1000
>> From: Dave Chinner <david@xxxxxxxxxxxxx>
>> To: Radek Pazdera <rpazdera@xxxxxxxxxx>
>> Cc: linux-ext4@xxxxxxxxxxxxxxx, lczerner@xxxxxxxxxx, kasparek@xxxxxxxxxxxx
>> Subject: Re: [RFC 0/9] ext4: An Auxiliary Tree for the Directory Index
>> 
>> On Sat, May 04, 2013 at 11:28:33PM +0200, Radek Pazdera wrote:
>> > Hello everyone,
>> > 
>> > I am a university student from Brno /CZE/. I decided to try to
>> > optimise the readdir/stat scenario in ext4 as my final school
>> > project. I posted some test results I got a few months ago [1].
>> > 
>> > I tried to implement an additional tree for ext4's directory index
>> > that would be sorted by inode numbers. The tree would then be used
>> > by ext4_readdir(), which should lead to a substantial performance
>> > increase for operations that manipulate a whole directory at once.
>> > 
>> > The performance increase should be visible especially with large
>> > directories or in case of low memory or cache pressure.
>> > 
>> > This patch series is what I've got so far. I must say, I originally
>> > thought it would be *much* simpler :).
>> ....
>> > BENCHMARKS
>> > ==========
>> > 
>> > I did some benchmarks and compared the performance with ext4/htree,
>> > XFS, and btrfs with up to 5,000,000 files in a single directory.
>> > Not all of them are finished yet, though (they run for days).
>> 
>> Just a note that for users that have this sort of workload on XFS,
>> it is generally recommended that they increase the directory block
>> size to 8-16k (from the default of 4k). The saddle point where 8-16k
>> directory blocks tend to perform better than 4k directory blocks is
>> around the 2-3 million file point....
>> 
>> Further, if you are doing random operations on such directories,
>> then increasing it to the maximum of 64k is recommended. This
>> greatly reduces the IO overhead of directory manipulations by making
>> the trees wider and shallower, i.e. we recommend trading off CPU
>> and memory for lower IO overhead and better layout on disk as it's
>> layout and IO that are the performance limiting factors for large
>> directories. :)

Hi Dave,

Thank you for pointing that out; I was not aware of it. I know that
the 5M tests may be a bit too extreme, but I thought it might be
interesting to see what happens.
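
For anyone who wants to repeat the XFS runs with your suggestion, the
directory block size is selected at mkfs time via the -n size= option.
This is only a sketch -- /dev/sdX is a placeholder for the test device,
so the commands are printed rather than executed:

```shell
# Print the mkfs command lines Dave's advice translates to; /dev/sdX
# is a placeholder device, so we only echo the commands instead of
# running them against real storage.
dev=/dev/sdX
echo "mkfs.xfs -f -n size=8192  $dev"   # 8k directory blocks (~2-3M+ files)
echo "mkfs.xfs -f -n size=65536 $dev"   # 64k directory blocks (random ops)
```

The filesystem block size (-b size=) stays at its default; only the
directory block size changes.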

>> > Full results are available here:
>> >     http://www.stud.fit.vutbr.cz/~xpazde00/soubory/ext4-5M/
>> 
>> Can you publish the scripts you used so we can try to reproduce
>> your results?
>
>Hi Dave,
>
>IIRC the tests used to generate the results should be found here:
>
>https://github.com/astro-/dir-index-test
>
>however, I am not entirely sure whether the GitHub repository is kept
>up-to-date. Radek, can you confirm?

Lukas is right, these are the scripts I used to get the results above
and they're up-to-date.

If you'd like to run the tests, there are some parameters you will
probably need to adjust in the run_tests.sh file. Namely:

DEVICE       - the device to run the tests on
DROP_OFF_DIR - a scratch directory for the copy test, which should
               reside on a separate device
RESULTS_DIR  - this is where you want your graphs to be stored
FILESYSTEMS  - ext4, btrfs, jfs or xfs. If you would like to change the
               parameters of mkfs, you can do it here:

               https://github.com/astro-/dir-index-test/blob/master/scripts/prepfs.sh

FSIZES       - the size of each file in the directory (if you provide a
               list of values, the tests will be run multiple times with
               different file sizes)

TEST_CASES   - the readdir-stat and getdents-stat are just isolated
               directory traversals (they are written in C)

               https://github.com/astro-/dir-index-test/blob/master/src/readdir-stat.c
               https://github.com/astro-/dir-index-test/blob/master/src/getdents-stat.c

               The other tests are here:

               https://github.com/astro-/dir-index-test/tree/master/tests

DIR_TYPE     - clean or dirty (you will probably be interested in the
               "dirty" type of tests). The difference can be seen here
               (the create_clean_dir and create_dirty_dir functions):
               https://github.com/astro-/dir-index-test/blob/master/scripts/create_files.py

DIR_SIZES    - you can put a list of values here
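
For illustration, a configuration at the top of run_tests.sh might look
like the sketch below. Every path and device name here is a made-up
placeholder for this example, not a default shipped in the repository:

```shell
# Hypothetical run_tests.sh settings -- adapt every path/device to
# your own machine before running anything.
DEVICE=/dev/sdb1                      # dedicated test device (gets reformatted!)
DROP_OFF_DIR=/mnt/scratch             # scratch dir on a *different* device
RESULTS_DIR=$HOME/dir-index-results   # where the gnuplot graphs are written
FILESYSTEMS="ext4 xfs"                # any subset of: ext4 btrfs jfs xfs
FSIZES="4096"                         # per-file size in bytes; a list repeats the runs
TEST_CASES="readdir-stat getdents-stat"
DIR_TYPE="dirty"                      # "clean" or "dirty"
DIR_SIZES="10000 100000 1000000"      # number of files per directory
echo "testing $FILESYSTEMS with $DIR_TYPE directories"
```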

To be able to run the tests properly, you need to have gnuplot installed.
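
If you just want a feel for what readdir-stat measures without building
the C sources, the core of the traversal can be approximated in shell:
enumerate the directory in raw readdir order (ls -f disables sorting)
and stat every entry. This is only an illustration on a throwaway
directory, not the benchmark itself:

```shell
# Approximate the readdir-stat traversal: list entries in readdir
# order (ls -f disables sorting and implies -a) and stat each one,
# discarding the output as the real test does.
dir=$(mktemp -d)
touch "$dir/a" "$dir/b" "$dir/c"
count=0
for name in $(ls -f "$dir"); do
    { [ "$name" = . ] || [ "$name" = .. ]; } && continue
    stat "$dir/$name" >/dev/null
    count=$((count + 1))
done
echo "stat()ed $count entries"
rm -rf "$dir"
```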

If you have any questions or problems, please, let me know :).

Cheers,
Radek

>-Lukas
>
>> 
>> > I also did some tests on an aged file system (I used the simple 0.8
>> > chance to create, 0.2 to delete a file) where the results of ext4
>> > with itree are much better even than xfs, which gets fragmented:
>> > 
>> >     http://www.stud.fit.vutbr.cz/~xpazde00/soubory/5M-dirty/cp.png
>> >     http://www.stud.fit.vutbr.cz/~xpazde00/soubory/5M-dirty/readdir-stat.png
>> 
>> This XFS result is of interest to me here - it shouldn't degrade
>> like that, so having the script to be able to reproduce it locally
>> would be helpful to me. Indeed, I posted a simple patch yesterday
>> that significantly improves XFS performance on a similar small file
>> create workload:
>> 
>> http://marc.info/?l=linux-fsdevel&m=137126465712701&w=2
>> 
>> That writeback plugging change should benefit ext4 as well in these
>> workloads....
>> 
>> Cheers,
>> 
>> Dave.
>> 