FAIL (91/181 SKIPPED): Test report for for-kernelci (6.2.0-rc5, arm-next, 2e84eedb)

Mark Rutland mark.rutland at arm.com
Thu Jan 26 09:25:14 PST 2023


On Thu, Jan 26, 2023 at 03:17:16PM +0000, Will Deacon wrote:
> On Thu, Jan 26, 2023 at 03:09:58PM +0000, Mark Rutland wrote:
> > On Thu, Jan 26, 2023 at 12:52:03PM +0000, Will Deacon wrote:
> > > [+Mark and Ard in case they have ideas]
> > > 
> > > On Wed, Jan 25, 2023 at 09:21:19AM -0000, cki-project at redhat.com wrote:
> > > > Hi, we tested your kernel and here are the results:
> > > > 
> > > >     Overall result: FAILED
> > > >              Merge: OK
> > > >            Compile: OK
> > > >               Test: FAILED
> > > > 
> > > > 
> > > > Kernel information:
> > > >     Commit message: Merge branch 'for-next/core' into for-kernelci
> > > > 
> > > > You can find all the details about the test run at
> > > >     https://datawarehouse.cki-project.org/kcidb/checkouts/66828
> > > > 
> > > > One or more kernel tests failed:
> > > >     Unrecognized or new issues:
> > > >          aarch64 - kdump - kexec_boot
> > > >                    Logs: https://datawarehouse.cki-project.org/kcidb/tests/6799495
> > > 
> > > This looks like we run into an undefined instruction when we jump to the
> > > kexec relocation code. Do you know if the failure is reproducible, and is
> > > the log identical each time?
> > 
> > I had a go in a QEMU KVM VM on ThunderX2, and a QEMU KVM TCG VM. With defconfig
> > I don't see the issue, but with the config from the test run link above I
> > consistently see the issue both under KVM and TCG (logs below).
> > 
> > It should be simple enough to figure out which config option is tickling this;
> > I'll go dig in to that..
> 
> Cheers, Mark. If you get a chance, it's probably also worth testing vanilla
> -rc5 to confirm that it's a regression in our queue (which we could
> assumedly bisect if necessary).

I have met the enemy, and he is me:

| git bisect start
| # good: [2241ab53cbb5cdb08a6b2d4688feb13971058f65] Linux 6.2-rc5
| git bisect good 2241ab53cbb5cdb08a6b2d4688feb13971058f65
| # bad: [2e84eedb182e43a9113c2c83cc3373c2ae99ce19] Merge branch 'for-next/core' into for-kernelci
| git bisect bad 2e84eedb182e43a9113c2c83cc3373c2ae99ce19
| # good: [3eb1b41fba97a1586e3ecca8c10547071f541567] kselftest/arm64: Add coverage of SME 2 and 2.1 hwcaps
| git bisect good 3eb1b41fba97a1586e3ecca8c10547071f541567
| # good: [daac835347a52d9d141be281e4657cc08a360e97] kselftest/arm64: Correct buffer size for SME ZA storage
| git bisect good daac835347a52d9d141be281e4657cc08a360e97
| # bad: [baaf553d3bc330697c68a00f96cf11f4edfeac7e] arm64: Implement HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS
| git bisect bad baaf553d3bc330697c68a00f96cf11f4edfeac7e
| # good: [47a15aa544279d34e14e17ca3b5855e39b946cec] arm64: Extend support for CONFIG_FUNCTION_ALIGNMENT
| git bisect good 47a15aa544279d34e14e17ca3b5855e39b946cec
| # good: [e4ecbe83fd1a5428d5458de04a3404f1b5444429] arm64: patching: Add aarch64_insn_write_literal_u64()
| git bisect good e4ecbe83fd1a5428d5458de04a3404f1b5444429
| # good: [90955d778ad7873964a271852b1f24d31e00248b] arm64: ftrace: Update stale comment
| git bisect good 90955d778ad7873964a271852b1f24d31e00248b
| # first bad commit: [baaf553d3bc330697c68a00f96cf11f4edfeac7e] arm64: Implement HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS
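
For readers unfamiliar with the log above: on a linear history, bisection is
essentially a binary search between a known-good and a known-bad commit. The
following is a minimal sketch of that idea only; `first_bad` and `is_bad` are
hypothetical names, and real `git bisect` also has to cope with merges, which
this does not model.

```python
def first_bad(commits, is_bad):
    """Return the index of the first bad commit in an ordered list.

    Assumptions (illustrative only): commits[0] is known good,
    commits[-1] is known bad, and every commit after the first
    bad one is also bad.
    """
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(commits[mid]):
            bad = mid   # the regression is at mid or earlier
        else:
            good = mid  # the regression is after mid
    return bad


# Toy history: ten commits, with the regression introduced at index 7.
history = list(range(10))
culprit = first_bad(history, lambda c: c >= 7)
print(culprit)
```

Each test halves the remaining range, which is why a handful of good/bad
answers is enough to single out one commit.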

It looks like this is down to the function alignment; reverting that commit
makes it go away, but if I then add:

| diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
| index 6914f6bf41e22..8dafeea05864e 100644
| --- a/arch/arm64/Kconfig
| +++ b/arch/arm64/Kconfig
| @@ -123,7 +123,7 @@ config ARM64
|         select DMA_DIRECT_REMAP
|         select EDAC_SUPPORT
|         select FRAME_POINTER
| -       select FUNCTION_ALIGNMENT_4B
| +       select FUNCTION_ALIGNMENT_8B    # HACK HACK HACK
|         select GENERIC_ALLOCATOR
|         select GENERIC_ARCH_TOPOLOGY
|         select GENERIC_CLOCKEVENTS_BROADCAST

... then it blows up again.

So we're probably doing a clever address calculation in the kexec idmap code
that ends up being wrong when the code gets shuffled a bit; possibly a
mismatched caller/callee alignment.
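
As a purely illustrative sketch of how that kind of breakage can arise (this
is not the kexec code, and the function sizes below are made up): if functions
are packed one after another and each start is rounded up to the function
alignment, then changing the alignment from 4 to 8 bytes inserts padding and
moves later functions, so any offset that was computed assuming the old layout
no longer points at a function entry.

```python
def align_up(addr, alignment):
    """Round addr up to the next multiple of alignment (a power of two)."""
    return (addr + alignment - 1) & ~(alignment - 1)


def layout(sizes, alignment, base=0):
    """Return the start offset of each function when packed in order."""
    offsets = []
    cursor = base
    for size in sizes:
        cursor = align_up(cursor, alignment)  # padding is inserted here
        offsets.append(cursor)
        cursor += size
    return offsets


# Hypothetical function sizes in bytes (A64 instructions are 4 bytes each).
SIZES = [20, 12, 36]

offsets_4b = layout(SIZES, 4)
offsets_8b = layout(SIZES, 8)

print(offsets_4b)  # [0, 20, 32]
print(offsets_8b)  # [0, 24, 40]
```

In this toy layout, an offset of 20 reaches the second function with 4-byte
alignment, but lands in padding bytes with 8-byte alignment.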

I'll dig a bit more...

Mark.


