Re: ChunkFS - measuring cross-chunk references

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

The tool estimates the cross-chunk references from an extt2/3 file
system. It considers a block group as one chunk and calcuates how many
block groups does a file span across. So, the block group size gives
the estimate of chunk size.

The file systems were aged for about 3-4 months on a developers laptop.

Should have given the background before. Below is the explanations for
the tool. Valh and others came up with this idea.

-------------------------
Chunkfs will only work if we have "few" cross-chunk references.  We
can estimate the effect of chunk size on the number of these
references using an existing ext2/3 file system and treating the block
groups as though they are chunks.  The basic idea is that we figure
out what the block group boundaries are and then find out which files
and directories span two or more block groups.

Step 1:
-------

Get a real-world ext2/3 file system. A file system which has been in
use is required. One from a laptop or a server of any sort will do
fine.

Step 2:
-------

Figure out where the block group boundaries are on disk. Two things
are to be known:

1. Which inode numbers are in which block group?
2. Which blocks are in which block group?

At the end of this step we should have a list that looks something like:

Block group 1: Inodes 11-343, blocks 1000-20000
Block group 2: Inodes 344-576, blocks 20000-40000
[...]

Step 3:
-------

For each file, get the inode number and use mapping from step 2 to
figure out which block group it is in.  Now use bmap() on each block
in the file, and find out the block number.  Use mapping from step 2
to figure out which block groups it has data in. For each file, record
the list of all block groups.

For each directory, get the inode number and map that to a block
group. Then get the inode numbers of all entries in the directory
(ignore symlinks) and map them to a block group.  For each directory,
record the list of all block groups.

Step 4:
-------

Count the number of cross-chunk references this file system would
need.  This is done by going through each directory and file, and
adding up the number of block groups it uses MINUS one.  So if a file
was in block groups 3, 7, and 24, then you would add 2 to the total
number of cross-chunk references.  If a file was only in block group
2, then you would add 0 to the total.


On 4/22/07, Amit Gud <gud@xxxxxxx> wrote:
Karuna sagar K wrote:
> Hi,
>
> The attached code contains program to estimate the cross-chunk
> references for ChunkFS file system (idea from Valh). Below are the
> results:
>

Nice to see some numbers! But would be really nice to know:

- what the chunk size is
- how the files were created or, more vaguely, how 'aged' the fs is
- what is the chunk allocation algorithm


Best,
AG
--
May the source be with you.
http://www.cis.ksu.edu/~gud




Thanks,
Karuna
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]
  Powered by Linux