[RFC PATCH] arm: i.MX6 Cortex-A9: Fix memory ordering inconsistency by disabling prefetch instructions
Mathieu Desnoyers
mathieu.desnoyers at efficios.com
Tue Jul 19 08:27:49 PDT 2022
----- On Jul 19, 2022, at 10:33 AM, Arnd Bergmann arnd at arndb.de wrote:
> On Mon, Jul 18, 2022 at 4:51 PM Mathieu Desnoyers
> <mathieu.desnoyers at efficios.com> wrote:
>
>> Request for Feedback
>> ====================
>>
>> This fix targets all i.MX configurations, but it is likely too broad (or
>> too narrow). It would be great if people with access to different
>> Freescale i.MX test boards, and test boards from other vendors, could try
>> to reproduce the issue to figure out what would be the right scope for
>> this fix.
>>
>> It would also be great if people with knowledge of the ARM CPU internals
>> could help understanding whether this fix really fixes an issue between
>> prefetch and memory barriers, or just happens to hide the issue. It
>> would be good to understand whether this issue only affects PLDW or if
>> it also affects the PLD instruction.
>
> I don't have any relevant hardware at hand, but looked at this for a few
> hours today, unfortunately without any notable success. Just documenting
> what I did here:
>
> - looked at the errata lists for cortex-a9 r2, for pl310 and for
> imxq6q to see if
> anything stuck out. I assume you've already done the same, but I can confirm
> that the errata that would match the symptom are listed as fixed in r2p10
> or earlier.
Yes, I've spent some quality time reading through those errata in the past 2 weeks,
and did not find anything relevant for the r2p10.
>
> - looked at objdump output from
> linux-image-5.18.0-0.bpo.1-armmp_5.18.2-1~bpo11+1_armhf.deb
> (not the same version, but hopefully be close enough), and compared that
> to v5.18.2 built with the same config using gcc-7.5 and gcc-10.3 to
> see if I could tell what is different. The output looks very similar, though
> my own gcc-10 apparently fails to inline arch_futex_atomic_op_inuser()
> and futex_atomic_op_inuser(). This looks like something we may want
> to force-inline in principle, but it seems unrelated to the bug you found
> since the debian vmlinux has these functions inlined and I don't think
> they are actually part of the broken code path.
Indeed, those op_inuser did not appear to be used in FUTEX_WAKE, FUTEX_WAIT
AFAIR, so I don't think the delta is relevant here.
>
> - looked for other quad-core Cortex-A9 SoCs to find someone with a
> similar revision to check if they have the same bug. The closest I
> can think of is the OMAP4 that uses an A9 r1p2.
Good to know.
>
> - Looked at the disabled errata handling in arch/arm/Kconfig.
> Unfortunately a couple of the workarounds we have there are
> now always disabled because of a dependency on
> ARCH_MULTIPLATFORM. It's a long shot, but you could try
> removing the dependencies and enabling all the Cortex-A9
> fixes like ARM_ERRATA_742230, ARM_ERRATA_742231,
> ARM_ERRATA_743622, ARM_ERRATA_751472, and
> ARM_ERRATA_754327.
I already attempted this, but ended up understanding that
handling of those errata workarounds were simply moved to U-Boot,
so it can set the relevant bits in the Diagnostic Control Register
at boot-time when allowed by the current privilege level, before
loading a secure boot Linux kernel. That being said, my test
system does not use secure boot.
U-Boot 2021.01+dfsg-4 has:
/usr/share/doc/u-boot-imx/configs/config.wandboard.gz :
CONFIG_ARM_ERRATA_743622=y
CONFIG_ARM_ERRATA_751472=y
CONFIG_ARM_ERRATA_761320=y
CONFIG_ARM_ERRATA_794072=y
CONFIG_ARM_ERRATA_845369=y
About errata 742230 ("ARM errata: DMB operation may be faulty"), it only
applies to Cortex-A9 r1p0..r2p2, which explains why the wandboard U-Boot
config has it =n. Nevertheless, I attempted modifying the Linux kernel code
to explicitly change the implementation of smp_mb() from dmb to dsb, but
it did not solve the issue.
About errata 754327 ("ARM errata: no automatic Store Buffer drain"), it
applies prior to r2p0, and is enabled in my Debian kernel configuration
already, because it does not depend on !ARCH_MULTIPLATFORM. The issue
reproduces with this work-around enabled.
Thanks,
Mathieu
--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
More information about the linux-arm-kernel
mailing list