Major memory performance decline from u-boot to barebox
Lucas Stach
l.stach at pengutronix.de
Mon Jul 8 03:53:20 PDT 2024
Hi Enrico,
Am Montag, dem 08.07.2024 um 12:22 +0200 schrieb Enrico Scholz:
> Hello,
>
> I have a karo tx6s module (imx6s, 512 MiB RAM) which is shipped with an
> ancient u-boot 2015 bootloader.
>
> barebox 2024.07 works out of the box on it. But under the booted linux
> system I see a major regression in memory performance.
>
> E.g. u-boot has
>
> > # hdparm -tT /dev/mmcblk3
> > Timing cached reads: 1236 MB in 2.00 seconds = 618.46 MB/sec
>
> while barebox shows only
>
> > Timing cached reads: 574 MB in 2.00 seconds = 287.08 MB/sec
>
>
> Running tinymembench[1] shows that pure memory read operations are not
> affected; e.g. both variants report around
>
> > NEON read : 1398.5 MB/s
>
>
> But write operations differ by a factor of 4-5:
>
> > standard memset : 2054.4 MB/s
>
> on u-boot vs. barebox with
>
> > standard memset : 472.7 MB/s
>
>
> I modified barebox to use the same DCD as u-boot; resulting MMDC
> registers are nearly identical[2]. /sys/kernel/debug/clk/clk_summary
> is also nearly the same (only LVDS1_SEL (unused) has another parent).
> TZASC is not used. GPRx registers are identical.
>
> Systems are running with linux 6.6 and master on an initrd.
>
> Disabling L2 cache in linux slows down things, but the relative results
> are similar (no difference in read, memset 322.3 MB/s -> 728.5 MB/s).
>
> Building barebox with CONFIG_MMU disabled makes no difference.
>
>
> Looking at another iMX6 system shows similarly bad numbers for barebox.
> E.g. an iMX6QP has a memset rate of 613.6 MB/s. But I do not have
> u-boot available for comparison.
>
>
> What could be the reason that u-boot is so much faster? Which memory
> related settings are carried over from the bootloader to linux? What
> else could I test?

The most likely cause is that barebox applies the workaround for ARM
erratum 845369, which has a major impact on streaming writes and thus
on both memset and memcpy performance. The old U-Boot probably does not
include this workaround.

You can check this theory by removing the call to
enable_arm_errata_845369_war in imx6_cpu_lowlevel_init.
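
To compare the two boots without a full benchmark suite, a small memset
throughput check along these lines might be enough. This is only a sketch
and not how tinymembench measures: the function name
memset_throughput_mb_s, the buffer size and the iteration count are made
up for illustration, and the empty asm barrier assumes GCC or Clang.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L	/* for clock_gettime() under strict -std= modes */
#endif
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Monotonic time in seconds. */
static double now_seconds(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/*
 * Hypothetical helper: fill a buffer of buf_size bytes `iterations` times
 * and return the throughput in MB/s (1 MB = 10^6 bytes).
 * Returns -1.0 on invalid arguments, allocation failure or a zero interval.
 */
double memset_throughput_mb_s(size_t buf_size, int iterations)
{
	unsigned char *buf;
	double start, elapsed;
	int i;

	if (buf_size == 0 || iterations <= 0)
		return -1.0;

	buf = malloc(buf_size);
	if (!buf)
		return -1.0;

	/* Touch every page once so page faults are not part of the timing. */
	memset(buf, 0xff, buf_size);

	start = now_seconds();
	for (i = 0; i < iterations; i++) {
		memset(buf, i & 0xff, buf_size);
		/* Compiler barrier: keep the stores from being optimised away. */
		__asm__ volatile("" : : "r"(buf) : "memory");
	}
	elapsed = now_seconds() - start;

	free(buf);

	if (elapsed <= 0.0)
		return -1.0;

	return (double)buf_size * (double)iterations / elapsed / 1e6;
}

#ifdef MEMSET_BENCH_MAIN
int main(void)
{
	/* 32 MiB buffer, much larger than the L2 cache; both values are arbitrary. */
	double rate = memset_throughput_mb_s((size_t)32 << 20, 20);

	if (rate < 0.0) {
		fprintf(stderr, "benchmark failed\n");
		return 1;
	}
	printf("memset: %.1f MB/s\n", rate);
	return 0;
}
#endif
```

Built with -O2 -DMEMSET_BENCH_MAIN and run once under each bootloader
build, the absolute numbers will not match tinymembench; only the ratio
between the two boots is meaningful.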

Regards,
Lucas