undefined instruction d5380001 (arm64 mrs emulation)

Catalin Marinas catalin.marinas at arm.com
Thu Oct 5 09:16:46 PDT 2017


Hi Matthias,

On Thu, Oct 05, 2017 at 04:54:09PM +0200, Matthias Brugger wrote:
> On 10/04/2017 11:11 AM, Matwey V. Kornilov wrote:
> > The patch fixes the issue. It should probably be applied to all
> > stable releases affected by this behaviour, since modprobe in the
> > initrd may load essential modules.
> > 
> > 2017-10-02 18:56 GMT+03:00 Suzuki K Poulose <Suzuki.Poulose at arm.com>:
> > > On Mon, Oct 02, 2017 at 03:11:18PM +0100, James Morse wrote:
> > > > On 02/10/17 12:24, Dave Martin wrote:
> > > > > On Fri, Sep 29, 2017 at 10:23:54PM +0300, Matwey V. Kornilov wrote:
> > > > > > > I am running 4.13.3 on a Rockchip 3328 platform (aarch64) with glibc 2.26
> > > > > > > and see the following at boot:
> > > > > > 
> > > > > > [   11.152061] modprobe[93]: undefined instruction: pc=0000ffff8ca48ff4
> > > > > > [   11.152707] Code: d503201f 8a180320 92750001 365ffc20 (d5380001)
> > > > > > [   11.154347] modprobe[94]: undefined instruction: pc=0000ffff94243ff4
> > > > > > [   11.154991] Code: d503201f 8a180320 92750001 365ffc20 (d5380001)
> > > > > > [   11.157070] modprobe[97]: undefined instruction: pc=0000ffff839a0ff4
> > > > > > [   11.157715] Code: d503201f 8a180320 92750001 365ffc20 (d5380001)
> > > > > > [   11.159265] modprobe[98]: undefined instruction: pc=0000ffffb0591ff4
> > > > > > [   11.159908] Code: d503201f 8a180320 92750001 365ffc20 (d5380001)
> > > > > > 
> > > > > > As far as I understand d5380001 should be emulated in cpufeature.c but
> > > > > > it is not. What could be wrong here?
> > > > > 
> > > > > The whole sequence is
> > > > > 
> > > > >     0:   d503201f        nop
> > > > >     4:   8a180320        and     x0, x25, x24
> > > > >     8:   92750001        and     x1, x0, #0x800
> > > > >     c:   365ffc20        tbz     w0, #11, 0xffffffffffffff90
> > > > >    10:*  d5380001        mrs     x1, midr_el1            <-- trapping instruction
> > > > 
> > > > This looks the same as:
> > > > https://bugzilla.redhat.com/show_bug.cgi?id=1496209
> > > > 
> > > > [...]
> > > > 
> > > > > What should happen here is that do_undefinstr() in
> > > > > arch/arm64/kernel/traps.c calls the registered undef hooks until it
> > > > > finds one that accepts the faulting instruction.
> > > > > 
> > > > > So, either the cpufeature undef hook is not getting called, or it is
> > > > > failing the instruction somewhere, possibly in
> > > > > cpufeature.c:emulate_id_reg() or emulate_sys_reg().
> > > > > 
> > > > > 
> > > > > Can you add some trace to those functions to see what's happening?
> > > > 
> > > > I couldn't reproduce this with linux-stable's v4.13.3 defconfig on Seattle or Juno.
> > > > 
> > > > What distribution are you running? Could you also try [0] to see if this is
> > > > something specific to your version of modprobe?
> > > 
> > > 
> > > It is worth noting that we register the MRS instruction handler as a late_initcall.
> > > The question is how late that can be, given that we are hitting this with
> > > modprobe, which may be used to request modules from the initrd. That would also explain
> > > why we can't reproduce it with simple test cases, which run after the handler is registered.
> > > 
> > > The next question is how early we want to push this. Since it doesn't really depend
> > > on any other subsystem, we could move it as early as "early". Or, to keep it in
> > > line with other arch-specific initcalls, we could simply make it an arch_initcall.
> > > 
> > > Matwey,
> > > 
> > > Please could you check if the following patch fixes the issue for you:
> > > 
> > > Cheers
> > > Suzuki
> > > 
> > > ----8>----
> > > 
> > > arm64: Enable MRS emulation early enough in the boot sequence
> > > 
> > > Make sure the MRS emulation is enabled early enough that
> > > early userspace applications (e.g., those run from initrd) can
> > > run without any trouble.
> > > 
> > > Signed-off-by: Suzuki K Poulose <suzuki.poulose at arm.com>
> > > 
> > > diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> > > index 9f9e0064c8c1..048f5469531f 100644
> > > --- a/arch/arm64/kernel/cpufeature.c
> > > +++ b/arch/arm64/kernel/cpufeature.c
> > > @@ -1294,4 +1294,4 @@ static int __init enable_mrs_emulation(void)
> > >          return 0;
> > >   }
> > > 
> > > -late_initcall(enable_mrs_emulation);
> > > +arch_initcall(enable_mrs_emulation);
> > > ---
> 
> I noticed that this patch did not land in v4.13.5.
> Was it forgotten, or are there any concerns?
> 
> We also hit this bug in openSUSE Tumbleweed:
> https://bugzilla.suse.com/show_bug.cgi?id=1061188

As Mark replied, we are still debating why this happens and whether the
above fix is sufficient. As we were digging further, we realised there
is no clear init level after which user space can be invoked, which
means Suzuki's patch may not always be sufficient.
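
To make the concern concrete, here is a toy user-space model (not kernel code) of the ordering problem. It assumes a simplified list of initcall levels and treats a hook as active only once the level it was registered at has completed, since ordering within a level is not something to rely on. The enum, struct layout and undef_handled() are hypothetical names for illustration; only the idea of matching on an instruction mask/value pair comes from the undef hook mechanism discussed above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified initcall levels, in the order they run (assumption: rootfs etc. omitted). */
enum init_level {
	LVL_EARLY, LVL_CORE, LVL_POSTCORE, LVL_ARCH,
	LVL_SUBSYS, LVL_FS, LVL_DEVICE, LVL_LATE,
};

/* Hypothetical model of an undef hook plus the level that registers it. */
struct undef_hook_model {
	uint32_t instr_mask;
	uint32_t instr_val;
	enum init_level registered_at;
};

/*
 * Would an undefined-instruction trap taken while level 'now' is running
 * be emulated? Walk the hooks and accept the first active one that matches.
 */
static bool undef_handled(const struct undef_hook_model *hooks, size_t n,
			  enum init_level now, uint32_t insn)
{
	for (size_t i = 0; i < n; i++) {
		/* Conservative: a hook registered in the current level may not be active yet. */
		if (hooks[i].registered_at >= now)
			continue;
		if ((insn & hooks[i].instr_mask) == hooks[i].instr_val)
			return true;
	}
	return false;
}
```

With the MRS hook modelled at LVL_LATE, a modprobe spawned during LVL_DEVICE takes the trap unhandled. Moving the hook to LVL_ARCH covers that case, but not a user-space helper spawned at LVL_ARCH or earlier, which is the gap described above. Which level actually spawned modprobe on the affected boards is not established here; the levels in this example are illustrative.
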

I proposed something as a way of spotting this issue early [1], but I
need to post it to linux-arch to get some consensus.

Can you post the full kernel log somewhere? I'm trying to figure out
what triggered the modprobe during the kernel boot.

Thanks,

Catalin

[1] http://lists.infradead.org/pipermail/linux-arm-kernel/2017-October/534465.html


