[PATCH v2] ARM: kexec: Use the right ISA for relocate_new_kernel
Dave P Martin
Dave.Martin at arm.com
Fri Nov 15 13:33:20 EST 2013
On Fri, Nov 15, 2013 at 06:11:43PM +0000, Taras Kondratiuk wrote:
> On 11/15/2013 07:38 PM, Dave Martin wrote:
> > On Fri, Nov 15, 2013 at 01:28:21PM +0200, Taras Kondratiuk wrote:
> >> And the issue I'm frequently facing in reloaded kernel (Thumb from ARM)
> >> is random crashes caused by undefined instructions.
> >>
> >> My observation summary:
> >> - Before starting a second kernel I'm dumping loaded zImage and then
> >> unpacked Image at final location and they are correct, so no issue
> >> with loading.
> >> - I observe two types of crash:
> >> 1) Undefined instruction in the middle of kernel code. After a crash
> >> I check failing address and there is always a *valid* Thumb
> >> instruction (CPU is in Thumb mode).
> >> 2) Jump to a wrong address which consequently causes undefined
> >> instruction exception. A trace of one example of a wrong jump is
> >> captured in [1]. Instead of jumping to 0xC049097C code gets
> >> executed at 0xED85E008. BTW the wrong address suspiciously looks
> >> like an ARM instruction.
> >
> > That jump to 0xED85E008 certainly looks strange ... I wonder whether
> > there could be some instructions missing from the trace.
> >
> >
> > How early do these crashes happen?
>
> At very early stages starting from setup_arch() up to early initcalls.
>
> > Is this happening on SMP, and if so, what is the state of secondary
> > CPUs across kexec?
>
> I have disabled CONFIG_SMP. Second CPU is busy-looping in ROM code and
> shouldn't cause any issues.

OK, that sounds reasonable.

> > If secondary CPUs are not safely parked, or their caches are not drained
> > before the kexec occurs, this can cause corruption of the new kernel
> > or unpredictable behaviour of the secondary CPUs.
> >
> >> - If second kernel is placed at different address (like in kdump case),
> >> then it boots fine and I don't observe any crashes.
> >> - If I check failing address in the first kernel (ARM) the code there
> >> is really undefined instruction if executed as Thumb.
> >> - Looks like pieces of old ARM kernel gets executed instead of new
> >> Thumb kernel. But as I've mentioned I'm reading physical memory via
> >> JTAG before starting second kernel and memory is matching a compiled
> >> Thumb 'Image'. Icache also gets cleaned...
> >> - Once when stopped on breakpoint I've seen a piece of ARM code in
> >> Thumb kernel. Interesting that I was looking at the same memory
> >
> > Thumb kernels do contain a small amount of ARM code, in the vectors
> > page for example. But it's possible you were also looking at stale
> > data.
>
> Right, but I mean there was an ARM code in place where definitely a
> Thumb code should be.

Sure.  Well, I guess this remains unexplained for now, but keep me
posted.
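
As a rough illustration of why 0xED85E008 "looks like an ARM
instruction": bits [31:28] of an ARM instruction word hold the condition
code, and 0xE is the AL ("always") condition that most unconditional ARM
code carries. The sketch below checks that field, and also checks
whether a halfword would start a 32-bit Thumb-2 encoding. The function
names are hypothetical, not kernel helpers, and this is only a
heuristic: plenty of ordinary data words also have 0xE in the top nibble.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bits [31:28] of an ARM (A32) instruction word hold the condition code. */
static inline unsigned int arm_cond_field(uint32_t word)
{
	return word >> 28;
}

/*
 * Heuristic only (hypothetical helper): 0xE is the AL ("always")
 * condition, so a word with that top nibble is at least plausible as an
 * unconditional ARM instruction.  It proves nothing on its own.
 */
static inline bool looks_like_arm_insn(uint32_t word)
{
	return arm_cond_field(word) == 0xE;
}

/*
 * In Thumb-2, a halfword whose bits [15:11] are 0b11101, 0b11110 or
 * 0b11111 is the first half of a 32-bit instruction; any other value is
 * a complete 16-bit instruction.
 */
static inline bool thumb_hw_starts_32bit(uint16_t hw)
{
	return (hw >> 11) >= 0x1D;
}
```

With these, looks_like_arm_insn(0xED85E008) is true, while the intended
target 0xC049097C has 0xC in the top nibble and is rejected. That is
consistent with the bogus address being an instruction word that was
loaded as a pointer, but it does not show where the word came from.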
Cheers
---Dave