Hi Enrico,

On Monday, 08.07.2024 at 12:22 +0200, Enrico Scholz wrote:
> Hello,
>
> I have a karo tx6s module (imx6s, 512 MiB RAM) which is shipped with an
> ancient u-boot 2015 bootloader.
>
> barebox 2024.07 works out-of-the box on it.  But under the booted linux
> system I see a major regression in memory performance.
>
> E.g. u-boot has
>
> > # hdparm -tT /dev/mmcblk3
> > Timing cached reads:   1236 MB in  2.00 seconds = 618.46 MB/sec
>
> while barebox shows only
>
> > Timing cached reads:   574 MB in  2.00 seconds = 287.08 MB/sec
>
>
> Running tinymembench[1] shows that pure memory read operations are not
> affected; e.g. both variants report around
>
> > NEON read            :   1398.5 MB/s
>
>
> But write operations differ by a factor of 4-5:
>
> > standard memset      :   2054.4 MB/s
>
> on u-boot vs. barebox with
>
> > standard memset      :    472.7 MB/s
>
>
> I modified barebox to use the same DCD as u-boot; resulting MMDC
> registers are nearly identical[2].  /sys/kernel/debug/clk/clk_summary
> is also nearly the same (only LVDS1_SEL (unused) has another parent).
> TZASC is not used.  GPRx registers are identical.
>
> Systems are running with linux 6.6 and master on an initrd.
>
> Disabling L2 cache in linux slows down things, but the relative results
> are similar (no difference in read, memset 322.3 MB/s -> 728.5 MB/s).
>
> Building barebox with CONFIG_MMU disabled makes no difference.
>
>
> Looking at another iMX6 system shows similar bad numbers for barebox.
> E.g. an iMX6QP has a memset rate of 613.6 MB/s.  But I do not have
> u-boot available for comparison.
>
>
> What could be the reason the u-boot is so much faster?  Which memory
> related settings are carried over from the bootloader to linux?  What
> could I test else?

The most likely cause is that Barebox applies the workaround for ARM
erratum 845369, which has a major impact on streaming writes and thus
both memset and memcpy performance. The old U-Boot probably does not
include this workaround.
You may check this theory by removing the call to
enable_arm_errata_845369_war in imx6_cpu_lowlevel_init.

Regards,
Lucas
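
P.S. If tinymembench is not at hand on the target, a rough memset
throughput check is easy to improvise for comparing a build with and
without the workaround. The following is only a minimal sketch under
assumptions: the helper name memset_mb_per_s, the buffer size and the
repetition count are made up for illustration and come from neither
barebox nor tinymembench, so the absolute numbers will not match
tinymembench's output.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/*
 * Hypothetical helper (not part of barebox or tinymembench):
 * fill a buffer of `size` bytes `reps` times and return the
 * throughput in MB/s (1 MB = 10^6 bytes). Returns a negative
 * value if the allocation fails or the elapsed time is too
 * short to measure.
 */
static double memset_mb_per_s(size_t size, int reps)
{
    /* Call through a volatile pointer so the compiler cannot elide the fills. */
    void *(*volatile fill)(void *, int, size_t) = memset;
    struct timespec t0, t1;
    unsigned char *buf = malloc(size);
    double secs;

    if (!buf)
        return -1.0;

    /* Touch every page once so page faults are not part of the timing. */
    fill(buf, 0, size);

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < reps; i++)
        fill(buf, i & 0xff, size);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    free(buf);

    secs = (double)(t1.tv_sec - t0.tv_sec) +
           (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
    if (secs <= 0.0)
        return -1.0;

    return (double)size * (double)reps / secs / 1e6;
}
```

Calling it as memset_mb_per_s(64u << 20, 16) from a small main() and
printing the result under both bootloaders should show whether the gap
closes once the workaround is removed. The buffer should be much larger
than the L2 cache, otherwise the figure says little about DRAM writes.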