Re: Ext4: Slow performance on first write after mount

Andreas Dilger <adilger@xxxxxxxxx> · Mon, 20 May 2013 00:39:50 -0600

On 2013-05-19, at 8:00, Theodore Ts'o <tytso@xxxxxxx> wrote:
> On Fri, May 17, 2013 at 06:51:23PM +0200, frankcmoeller@xxxxxxxx wrote:
>> - Why do you throw away the buffer cache instead of storing it on disk during umount? Initializing the buffer cache is quite painful for applications which need a specific write throughput.
>> - A workaround would be to read whole /proc/.../mb_groups file right after every mount. Correct?
> 
> Simply adding "cat /proc/fs/ext4/<dev>/mb_groups > /dev/null" to one
> of the /etc/init.d scripts, or to /etc/rc.local, is probably the
> simplest fix, yes.
> 
>> - I can try to add a mount option to initialize the cache at mount time. Would you be interested in such a patch?
> 
> Given the simple nature of the above workaround, it's not obvious to
> me that trying to make file system format changes, or even adding a
> new mount option, is really worth it.  This is especially true given
> that mount -a is sequential, so if there are a large number of big
> file systems, using this as a mount option would slow down the boot
> significantly.  It would be better to do this in parallel, which you
> could do in userspace much more easily using the "cat
> /proc/fs/ext4/<dev>/mb_groups" workaround.

Since we already start a thread at mount time to handle the inode
table zeroing, it would also be possible to co-opt this thread for
preloading the group metadata from the bitmaps.
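
Until something like that exists in the kernel, the userspace route
can already be run in parallel. A minimal sketch, assuming the
mballoc group info is exposed at /proc/fs/ext4/<dev>/mb_groups; the
function name preload_mb_groups and its optional base-directory
argument are made up for illustration (the argument only exists so
the loop can be tried against a scratch directory):

```shell
#!/bin/sh
# Read every mb_groups file under a base directory, one background
# reader per filesystem, then wait for all of them to finish.
# The base-directory argument is a hypothetical convenience for
# testing; it defaults to /proc/fs/ext4.
preload_mb_groups() {
    base="${1:-/proc/fs/ext4}"
    n=0
    for f in "$base"/*/mb_groups; do
        [ -r "$f" ] || continue      # also skips an unmatched glob
        cat "$f" > /dev/null &       # reading forces the bitmaps to load
        n=$((n + 1))
    done
    wait                             # block until every reader is done
    echo "$n"                        # number of filesystems preloaded
}
```

Calling preload_mb_groups with no argument from /etc/rc.local, once
the filesystems are mounted, starts one reader per mounted ext4
filesystem, so the wait is bounded by the slowest device rather than
by the sum of all of them.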

>> - I can see (see debug output) that the call to ext4_wait_block_bitmap in mballoc.c line 848 takes the longest time during buffer cache initialization (some 1/100 of a second). Can this be improved?
> 
> The delay is caused purely by I/O delay, so short of replacing the HDD
> with an SSD, not really....

Well, with a larger flex_bg factor at format time there will be more
bitmaps allocated together on disk, so fewer seeks needed to load
them after a new mount. We use a flex_bg factor of 256 for this
reason on our very large storage targets.
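
To put rough numbers on that, here is a back-of-the-envelope sketch.
It assumes 4 KiB blocks (so 128 MiB per block group) and that each
flex_bg packs its bitmaps into one contiguous run. The function name
is hypothetical, and the result is an idealized estimate, not a
measurement:

```shell
#!/bin/sh
# Estimate how many separate on-disk bitmap runs must be visited to
# load all block bitmaps, for a given filesystem size and flex_bg
# factor. Assumes 4 KiB blocks: 32768 blocks per group = 128 MiB,
# i.e. 8 block groups per GiB.
bitmap_runs() {
    size_gib="$1"                    # filesystem size in GiB
    factor="$2"                      # block groups per flex group
    groups=$((size_gib * 8))
    # Round up: a partial flex group still needs its own run.
    echo $(( (groups + factor - 1) / factor ))
}

bitmap_runs 16384 16     # 16 TiB, mke2fs default factor of 16: prints 8192
bitmap_runs 16384 256    # 16 TiB, factor of 256: prints 512
```

The factor is chosen at format time with the -G option of mke2fs (it
must be a power of two and requires the flex_bg feature), for example
"mke2fs -t ext4 -G 256 <dev>".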

Cheers, Andreas