Theodore Tso wrote:
So I started looking to see how we might be able to improve mballoc to
avoid freespace fragmentation, and I came up with the following high
level design. Does this look sane? Have I overlooked anything?
1) In ext4_mb_normalize_request(), if the inode that we are allocating
does not have any open file descriptors for write (i.e., it's already
closed and we're allocating via delalloc) _and_ the inode was
previously opened with O_CREAT and without O_APPEND (checked via a
flag in EXT4_I(inode)), then do not normalize the size to a power of
two, but rather to the filesystem blocksize.
The idea here is that we should be trying to find an exact fit, since
most of the time (except for log files, which get appended; hence the
O_CREAT && !O_APPEND test) once a file is written, that is probably
the final size for the file. So normalizing the size for the
preallocation area to a power of two will be counterproductive for
most files.
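
To make the difference concrete, here is a minimal user-space sketch of the two rounding policies, not the actual mballoc code. The struct inode_hint and the function names are hypothetical stand-ins for the per-inode flag mentioned above, and the normalization is simplified to pure power-of-two rounding as described in this message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Round len (in bytes) up to the next power of two. */
static uint64_t round_up_pow2(uint64_t len)
{
    uint64_t p = 1;

    assert(len >= 1 && len <= (UINT64_C(1) << 63));
    while (p < len)
        p <<= 1;
    return p;
}

/* Round len (in bytes) up to a whole number of filesystem blocks. */
static uint64_t round_up_block(uint64_t len, uint64_t blocksize)
{
    assert(blocksize > 0);
    return (len + blocksize - 1) / blocksize * blocksize;
}

/* Hypothetical per-inode state; the real flag would live in EXT4_I(inode). */
struct inode_hint {
    unsigned int open_writers;   /* file descriptors still open for write */
    bool created_no_append;      /* opened with O_CREAT and without O_APPEND */
};

/*
 * Proposed policy from point 1: a closed file that was created without
 * O_APPEND gets an exact fit; everything else keeps power-of-two rounding.
 */
static uint64_t normalize_request(const struct inode_hint *h,
                                  uint64_t len, uint64_t blocksize)
{
    if (h->open_writers == 0 && h->created_no_append)
        return round_up_block(len, blocksize);
    return round_up_pow2(len);
}
```

For a 70000-byte file on a 4096-byte block size, the exact-fit path asks for 73728 bytes (18 blocks), while power-of-two rounding asks for 131072 bytes (32 blocks).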
I am trying to understand which use cases benefit from normalizing the
allocation request size. If they are uncommon, perhaps we should
disable normalization by default, unless the application opens the
file with O_APPEND?
2) If fewer than X files have been opened in the last Y jiffies in the
parent directory (using the dentry path used to open the file), then
do not set EXT4_MB_HINT_GROUP_ALLOC in ext4_mb_group_or_file(). We
can simulate this to test #1, without writing this patch, by
setting mb_stream_request to 0 (which should completely disable group
preallocation).
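
One way the "X files in Y jiffies" test could be tracked is with a fixed-window counter per directory. The sketch below is user-space C under stated assumptions: struct dir_activity and both function names are hypothetical, the tick values stand in for jiffies, and a real implementation would need locking and a place to hang the state off the directory inode.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-directory bookkeeping. */
struct dir_activity {
    uint64_t window_start;  /* tick at which the current window began */
    unsigned int opens;     /* files opened during the current window */
};

/* Record one file open in this directory at tick 'now'. */
static void dir_note_open(struct dir_activity *d, uint64_t now,
                          uint64_t window)
{
    assert(window > 0);
    if (now - d->window_start >= window) {
        /* Previous window has expired: start a fresh one. */
        d->window_start = now;
        d->opens = 0;
    }
    d->opens++;
}

/*
 * Return true if the directory saw at least min_opens (X) opens within
 * the current window of 'window' ticks (Y); only then would the caller
 * set the group allocation hint.
 */
static bool dir_wants_group_alloc(const struct dir_activity *d, uint64_t now,
                                  uint64_t window, unsigned int min_opens)
{
    if (now - d->window_start >= window)
        return false;   /* window expired: directory is idle */
    return d->opens >= min_opens;
}
```

A fixed window is the cheapest option; it can undercount a burst that straddles a window boundary, which may or may not matter for this heuristic.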
- Ted
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html