[PATCH] aarch64: kern_paddr_start calculation makes more accurate legacy check

Alexander Kamensky alexander.kamensky42 at gmail.com
Sat Aug 26 13:23:00 PDT 2023


Hi Pingfan,

Please see inline

On Thu, Aug 24, 2023 at 11:17 PM Pingfan Liu <piliu at redhat.com> wrote:
>
> On Sun, Aug 20, 2023 at 3:16 AM Alexander Kamensky
> <alexander.kamensky42 at gmail.com> wrote:
> >
> > In the latest openembedded with aarch64 image that uses 6.4.3 kernel on qemu I
> > tried to run makedumpfile in secondary kernel with /proc/vmcore. It failed as
> > follows:
> >
> > > root at qemuarm64:~# makedumpfile -c -F /proc/vmcore > /dev/null
> > > read_from_vmcore: Can't seek the dump memory(/proc/vmcore). (offset: ffffffc0123fa000) Invalid argument
> > > readpage_elf: Can't read the dump memory(/proc/vmcore).
> > > <snip>
> >
> > It turns out that not all parts for /proc/vmcore are readable:
> >
> > > root at qemuarm64:~# cat /proc/vmcore > /dev/null
> > > [  136.394931] Internal error: synchronous external abort: 0000000096000010 [#1] PREEMPT SMP
> > > [  136.422434] Modules linked in:
> > > <snip>
> >
> > Binary search of different kernel versions I found that the issue goes back to
> > 5.11. It works fine with 5.10 version of the kernel.
> >
> > After running secondary kernel under qemu gdb, I've found that the primary kernel
> > main memory segment in elfcorehdr has wrong address and size.
> >
> > Looking at kexec output with debug enabled at the time when it loads secondary
> > crash kernel I saw the following:
> >
> > > Kernel text Elf header: p_type = 1, p_offset = 0x4030200000 p_paddr = 0x4030200000 p_vaddr = 0xffffffffffffffff p_filesz = 0xffffffc011410000 p_memsz = 0xffffffc011410000
> >
> > These values come from elf_info variable and are set in iomem_range_callback
> > function. The function gets the following input on my system:
> >
> > root at qemuarm64:~# cat /proc/iomem | grep "Kernel code"
> >   40210000-40feffff : Kernel code
> > root at qemuarm64:~# cat /proc/kallsyms | grep -e ' _text$'
>
> This is the crux of the problem. Could you say something about how it
> disappeared?

Here is what I found:

* in kallsyms .tmp_vmlinux.kallsyms1.syms _text I have this:

ffffffc008000000 T _text
ffffffc008010000 T __irqentry_text_start
ffffffc008010000 T _stext

* in scripts/kallsyms.c ranges defined as

static struct addr_range text_ranges[] = {
    { "_stext",     "_etext"     },
    { "_sinittext", "_einittext" },
};

* as a result _text fails check_symbol_range function its not included in
kallsyms

* from 'gdb vmlinux'

(gdb) p _text
$1 = 0xffffffc008000000 <_text> "MZ@\372'd>\024"
(gdb) p _stext
$2 = 0xffffffc008010000 <_stext> "\037 \003\325\037 \003\325\037
\003\325\037 \003\325?#\003\325\375{\273\251\375\003"

* from kernel linker script ./arch/arm64/kernel/vmlinux.lds, note
_stext follows _text and aligned by 0x00010000

OUTPUT_ARCH(aarch64)
ENTRY(_text)
jiffies = jiffies_64;
PECOFF_FILE_ALIGNMENT = 0x200;
SECTIONS
{
 /DISCARD/ : { *(.exitcall.exit) *(.discard) *(.discard.*) *(.modinfo)
*(.gnu.version*) }
 /DISCARD/ : {
  *(.interp .dynamic)
  *(.dynsym .dynstr .hash .gnu.hash)
 }
 . = ((((-(((1)) << ((((39))) - 1)))) + (0x08000000)));
 .head.text : {
  _text = .; // <------------------------------------------------
  KEEP(*(.head.text))
 }
 .text : ALIGN(0x00010000) { // <--------------------------------
  _stext = .; // <-----------------------------------------------
   . = ALIGN(8); __irqentry_text_start = .; *(.irqentry.text)
__irqentry_text_end = .;
   . = ALIGN(8); __softirqentry_text_start = .; *(.softirqentry.text)
__softirqentry_text_end = .;

* ./arch/arm64/kernel/vmlinux.lds is preproccessed file from
arch/arm64/kernel/vmlinux.lds.S, here are relevant snippets:

    .head.text : {
        _text = .;
        HEAD_TEXT
    }
    .text : ALIGN(SEGMENT_ALIGN) {    /* Real text segment        */
        _stext = .;        /* Text and read-only data    */
            IRQENTRY_TEXT
            SOFTIRQENTRY_TEXT

* and SEGMENT_ALIGN defined in

arch/arm64/include/asm/memory.h:#define SEGMENT_ALIGN           SZ_64K

* symbol_valid function from scripts/kallsyms.c will ignore range
check if --all-symbols option is passed

* from scripts/link-vmlinux.sh one can see that --all-symbols will be
added only if CONFIG_KALLSYMS_ALL is enabled

kallsyms()
{
        local kallsymopt;

        if is_enabled CONFIG_KALLSYMS_ALL; then
                kallsymopt="${kallsymopt} --all-symbols"
        fi

* in my kernel CONFIG_KALLSYMS_ALL is not set

* I've checked aarch64 fedora 38 kernel config in there
CONFIG_KALLSYMS_ALL is enabled that explains why you see this _text in
kallsyms of fedora

* so far it seems that _text address cannot be part of kallsyms unless
CONFIG_KALLSYMS_ALL is enabled or original _text is 64k aligned so it
will coincide with _stext

* in all my testing even with 5.10 kernel _text was missing in
kallsyms, that it why I thought it presence constituted legacy case,
in 5.10 case last it worked for me length check was failing and code
was taking else part using _stext anyway

* I believe my patch is correct and it should work for both cases when
CONFIG_KALLSYMS_ALL is set or not.

* Maybe we can remove whole 'kva_text_end - kva_stext == length' check
and set 'elf_info.kern_paddr_start = base' unconditionally, but I am
concerned that some older kernels will trip on it

    if (kva_text_end - kva_stext == length)
        elf_info.kern_paddr_start = base - (kva_stext - kva_text);
    else
        elf_info.kern_paddr_start = base;

Thanks,
Alexander Kamensky

> On my side, I can see this info on 6.2.14-300.fc38.aarch64
>
> grep -w _text /proc/kallsyms
> ffffc705d8500000 T _text
>
> Thanks,
>
> Pingfan
> > root at qemuarm64:~# cat /proc/kallsyms | grep -e ' _stext$'
> > ffffffc010010000 T _stext
> > root at qemuarm64:~# cat /proc/kallsyms | grep -e ' __init_begin$'
> > ffffffc010df0000 T __init_begin
> >
> > the function calculates elf_info.kern_paddr_start address similar to
> > these manual steps:
> >
> > base = 0x40210000
> > length = 0x40feffff - 0x40210000 + 0x1 = 0xde0000
> > kva_text_end - kva_stext = 0xffffffc010df0000 - 0xffffffc010010000 = 0xde0000
> >
> > elf_info.kern_paddr_start = base - (kva_stext - kva_text) =
> >                       0x40210000 - 0xffffffc010010000 = 0x4030200000
> >
> > since 'length' and 'kva_text_end - kva_stext' are equal the function calculate
> > elf_info.kern_paddr_start as in legacy case:
> >
> >   if (kva_text_end - kva_stext == length)
> >      elf_info.kern_paddr_start = base - (kva_stext - kva_text); // <---------
> >   else
> >      elf_info.kern_paddr_start = base;
> >
> > but my kernel is not legacy and kva_text is zero.
> >
> > The fix is just to add a check for kva_text not being zero before following the
> > legacy case.
> >
> > Signed-off-by: Alexander Kamensky <alexander.kamensky42 at gmail.com>
> > ---
> >  kexec/arch/arm64/crashdump-arm64.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/kexec/arch/arm64/crashdump-arm64.c b/kexec/arch/arm64/crashdump-arm64.c
> > index 3098315..639c82a 100644
> > --- a/kexec/arch/arm64/crashdump-arm64.c
> > +++ b/kexec/arch/arm64/crashdump-arm64.c
> > @@ -82,7 +82,7 @@ static int iomem_range_callback(void *UNUSED(data), int UNUSED(nr),
> >                  * For compatibility, deduce by comparing the gap "__init_begin - _stext"
> >                  * and the res size of "Kernel code" in /proc/iomem
> >                  */
> > -               if (kva_text_end - kva_stext == length)
> > +               if ((kva_text_end - kva_stext == length) && (kva_text != 0))
> >                         elf_info.kern_paddr_start = base - (kva_stext - kva_text);
> >                 else
> >                         elf_info.kern_paddr_start = base;
> > --
> > 2.41.0
> >
> >
> > _______________________________________________
> > kexec mailing list
> > kexec at lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/kexec
> >
>



More information about the kexec mailing list