Hi Jan,
On 24.08.22 at 12:40, Jan Kara wrote:
Hi Stefan!
On Wed 24-08-22 12:17:14, Stefan Wahren wrote:
On 23.08.22 at 22:15, Jan Kara wrote:
Hello,
So I have implemented mballoc improvements to avoid spreading allocations
even with mb_optimize_scan=1. They fix the performance regression I was able
to reproduce with reaim on my test machine:
                mb_optimize_scan=0     mb_optimize_scan=1     patched
Hmean  disk-1      2076.12 (  0.00%)      2099.37 (  1.12%)      2032.52 ( -2.10%)
Hmean  disk-41    92481.20 (  0.00%)     83787.47 * -9.40%*     90308.37 ( -2.35%)
Hmean  disk-81   155073.39 (  0.00%)    135527.05 *-12.60%*    154285.71 ( -0.51%)
Hmean  disk-121  185109.64 (  0.00%)    166284.93 *-10.17%*    185298.62 (  0.10%)
Hmean  disk-161  229890.53 (  0.00%)    207563.39 * -9.71%*    232883.32 *  1.30%*
Hmean  disk-201  223333.33 (  0.00%)    203235.59 * -9.00%*    221446.93 ( -0.84%)
Hmean  disk-241  235735.25 (  0.00%)    217705.51 * -7.65%*    239483.27 *  1.59%*
Hmean  disk-281  266772.15 (  0.00%)    241132.72 * -9.61%*    263108.62 ( -1.37%)
Hmean  disk-321  265435.50 (  0.00%)    245412.84 * -7.54%*    267277.27 (  0.69%)
Stefan, can you please test whether these patches fix the problem for you as
well? Comments & review welcome.
I tested the whole series against 5.19 and 6.0.0-rc2. In both cases the
update process succeeded, which is an improvement, but the download + unpack
duration (~ 7 minutes) is not as good as with mb_optimize_scan=0 (~ 1
minute).
OK, thanks for testing! I'll try to check the untar step specifically to see
whether I can still spot differences in the IO pattern on my test machine.
I made two iostat output logs covering the complete download phase with
5.19 and your series applied. iostat was running over an ssh connection and
rpi-update on the serial console.
First with mb_optimize_scan=0
https://github.com/lategoodbye/mb_optimize_scan_regress/blob/main/5.19_SDCIT_patch_nooptimize_download_success.iostat.log
Second with mb_optimize_scan=1
https://github.com/lategoodbye/mb_optimize_scan_regress/blob/main/5.19_SDCIT_patch_optimize_download_success.iostat.log
Maybe this helps.
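As a rough aid for comparing the two logs (not something from the thread), a
small script along the following lines could average the per-interval write
rate of each run; the device name mmcblk0 and the kB_wrtn/s column label are
assumptions about the log format and may need adjusting to match the actual
iostat output:

#!/usr/bin/env python3
# Rough sketch: average the per-interval write rate for one device across an
# iostat log. DEVICE and COLUMN are assumptions, not taken from the thread.
import sys

DEVICE = 'mmcblk0'      # assumed device name on the Raspberry Pi
COLUMN = 'kB_wrtn/s'    # assumed column label in the iostat device report

def average_rate(path, device=DEVICE, column=COLUMN):
    col = None
    samples = []
    with open(path) as f:
        for line in f:
            fields = line.split()
            if not fields:
                continue
            if fields[0].startswith('Device') and column in fields:
                col = fields.index(column)          # remember column position
            elif col is not None and fields[0] == device and len(fields) > col:
                # one interval's write rate (the very first report is the
                # since-boot average; ignored here for simplicity)
                samples.append(float(fields[col]))
    return sum(samples) / len(samples) if samples else 0.0

if __name__ == '__main__':
    for path in sys.argv[1:]:
        print('%s: avg %s = %.1f' % (path, COLUMN, average_rate(path)))

Pointed at both log files in turn, it prints one average per run, which gives
a single number to put next to the ~1 minute vs ~7 minute observation.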
Unfortunately I don't have much time this week, and next week I'm on
holiday.
No problem.
Just a question: my tests always had MBCACHE=y. Is it possible that
mb_optimize_scan is counterproductive for MBCACHE in this case?
MBCACHE (despite the similar name) is actually related to extended attributes,
so it likely has no impact on your workload.
I'm asking because, before the download, the update script removes the files
from the previous update, which already causes a high load.
Do you mean the removal step is already noticeably slower with
mb_optimize_scan=1? The removal will be modifying directory blocks, inode
table blocks, block & inode bitmaps, and group descriptors. So if block
allocations are more spread out (due to mb_optimize_scan=1 being used during
the untar), the removal may also take somewhat longer.
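To get a feel for how spread out the resulting allocations are, a rough sketch
like the one below (again not from the thread) could count how many ext4 block
groups a directory tree's data extents touch; it assumes filefrag is available,
4 KiB blocks, and the default 32768 blocks per block group:

#!/usr/bin/env python3
# Rough sketch: estimate how many ext4 block groups a directory tree's data
# extents are spread across, using filefrag -v. Assumes 4 KiB blocks and the
# default 32768 blocks per block group.
import os
import re
import subprocess
import sys

BLOCKS_PER_GROUP = 32768  # ext4 default with 4 KiB blocks
# Extent lines look like "   0:   0..  4095:  139264..  143359: ...";
# capture the physical start block.
EXTENT_RE = re.compile(r'^\s*\d+:\s+\d+\.\.\s*\d+:\s+(\d+)\.\.')

def extent_groups(path):
    """Return the set of block groups touched by one file's extents."""
    groups = set()
    try:
        out = subprocess.run(['filefrag', '-v', path],
                             capture_output=True, text=True, check=True).stdout
    except (subprocess.CalledProcessError, FileNotFoundError):
        return groups
    for line in out.splitlines():
        m = EXTENT_RE.match(line)
        if m:
            groups.add(int(m.group(1)) // BLOCKS_PER_GROUP)
    return groups

def main(root):
    all_groups = set()
    nfiles = 0
    for dirpath, _, files in os.walk(root):
        for name in files:
            g = extent_groups(os.path.join(dirpath, name))
            if g:
                all_groups |= g
                nfiles += 1
    print('%d files spread over %d block groups' % (nfiles, len(all_groups)))

if __name__ == '__main__':
    main(sys.argv[1] if len(sys.argv) > 1 else '.')

Running it over the unpacked tree from an mb_optimize_scan=0 untar and from an
mb_optimize_scan=1 untar would show whether the extents really land in
noticeably more block groups in the latter case.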
Not sure about this; maybe we should concentrate on the download / untar phase.
Honza