[PATCH] aarch64: kern_paddr_start calculation makes more accurate legacy check

Pingfan Liu piliu at redhat.com
Mon Aug 28 19:35:24 PDT 2023


On Sun, Aug 27, 2023 at 4:23 AM Alexander Kamensky
<alexander.kamensky42 at gmail.com> wrote:
>
> Hi Pingfan,
>
> Please see inline
>
> On Thu, Aug 24, 2023 at 11:17 PM Pingfan Liu <piliu at redhat.com> wrote:
> >
> > On Sun, Aug 20, 2023 at 3:16 AM Alexander Kamensky
> > <alexander.kamensky42 at gmail.com> wrote:
> > >
> > > In the latest openembedded with aarch64 image that uses 6.4.3 kernel on qemu I
> > > tried to run makedumpfile in secondary kernel with /proc/vmcore. It failed as
> > > follows:
> > >
> > > > root at qemuarm64:~# makedumpfile -c -F /proc/vmcore > /dev/null
> > > > read_from_vmcore: Can't seek the dump memory(/proc/vmcore). (offset: ffffffc0123fa000) Invalid argument
> > > > readpage_elf: Can't read the dump memory(/proc/vmcore).
> > > > <snip>
> > >
> > > It turns out that not all parts for /proc/vmcore are readable:
> > >
> > > > root at qemuarm64:~# cat /proc/vmcore > /dev/null
> > > > [  136.394931] Internal error: synchronous external abort: 0000000096000010 [#1] PREEMPT SMP
> > > > [  136.422434] Modules linked in:
> > > > <snip>
> > >
> > > Binary search of different kernel versions I found that the issue goes back to
> > > 5.11. It works fine with 5.10 version of the kernel.
> > >
> > > After running secondary kernel under qemu gdb, I've found that the primary kernel
> > > main memory segment in elfcorehdr has wrong address and size.
> > >
> > > Looking at kexec output with debug enabled at the time when it loads secondary
> > > crash kernel I saw the following:
> > >
> > > > Kernel text Elf header: p_type = 1, p_offset = 0x4030200000 p_paddr = 0x4030200000 p_vaddr = 0xffffffffffffffff p_filesz = 0xffffffc011410000 p_memsz = 0xffffffc011410000
> > >
> > > These values come from elf_info variable and are set in iomem_range_callback
> > > function. The function gets the following input on my system:
> > >
> > > root at qemuarm64:~# cat /proc/iomem | grep "Kernel code"
> > >   40210000-40feffff : Kernel code
> > > root at qemuarm64:~# cat /proc/kallsyms | grep -e ' _text$'
> >
> > This is the crux of the problem. Could you say something about how it
> > disappeared?
>
> Here is what I found:
>
> * in kallsyms .tmp_vmlinux.kallsyms1.syms _text I have this:
>
> ffffffc008000000 T _text
> ffffffc008010000 T __irqentry_text_start
> ffffffc008010000 T _stext
>
> * in scripts/kallsyms.c ranges defined as
>
> static struct addr_range text_ranges[] = {
>     { "_stext",     "_etext"     },
>     { "_sinittext", "_einittext" },
> };
>
> * as a result _text fails check_symbol_range function its not included in
> kallsyms
>
> * from 'gdb vmlinux'
>
> (gdb) p _text
> $1 = 0xffffffc008000000 <_text> "MZ@\372'd>\024"
> (gdb) p _stext
> $2 = 0xffffffc008010000 <_stext> "\037 \003\325\037 \003\325\037
> \003\325\037 \003\325?#\003\325\375{\273\251\375\003"
>
> * from kernel linker script ./arch/arm64/kernel/vmlinux.lds, note
> _stext follows _text and aligned by 0x00010000
>
> OUTPUT_ARCH(aarch64)
> ENTRY(_text)
> jiffies = jiffies_64;
> PECOFF_FILE_ALIGNMENT = 0x200;
> SECTIONS
> {
>  /DISCARD/ : { *(.exitcall.exit) *(.discard) *(.discard.*) *(.modinfo)
> *(.gnu.version*) }
>  /DISCARD/ : {
>   *(.interp .dynamic)
>   *(.dynsym .dynstr .hash .gnu.hash)
>  }
>  . = ((((-(((1)) << ((((39))) - 1)))) + (0x08000000)));
>  .head.text : {
>   _text = .; // <------------------------------------------------
>   KEEP(*(.head.text))
>  }
>  .text : ALIGN(0x00010000) { // <--------------------------------
>   _stext = .; // <-----------------------------------------------
>    . = ALIGN(8); __irqentry_text_start = .; *(.irqentry.text)
> __irqentry_text_end = .;
>    . = ALIGN(8); __softirqentry_text_start = .; *(.softirqentry.text)
> __softirqentry_text_end = .;
>
> * ./arch/arm64/kernel/vmlinux.lds is preproccessed file from
> arch/arm64/kernel/vmlinux.lds.S, here are relevant snippets:
>
>     .head.text : {
>         _text = .;
>         HEAD_TEXT
>     }
>     .text : ALIGN(SEGMENT_ALIGN) {    /* Real text segment        */
>         _stext = .;        /* Text and read-only data    */
>             IRQENTRY_TEXT
>             SOFTIRQENTRY_TEXT
>
> * and SEGMENT_ALIGN defined in
>
> arch/arm64/include/asm/memory.h:#define SEGMENT_ALIGN           SZ_64K
>
> * symbol_valid function from scripts/kallsyms.c will ignore range
> check if --all-symbols option is passed
>
> * from scripts/link-vmlinux.sh one can see that --all-symbols will be
> added only if CONFIG_KALLSYMS_ALL is enabled
>
> kallsyms()
> {
>         local kallsymopt;
>
>         if is_enabled CONFIG_KALLSYMS_ALL; then
>                 kallsymopt="${kallsymopt} --all-symbols"
>         fi
>
> * in my kernel CONFIG_KALLSYMS_ALL is not set
>
> * I've checked aarch64 fedora 38 kernel config in there
> CONFIG_KALLSYMS_ALL is enabled that explains why you see this _text in
> kallsyms of fedora
>
> * so far it seems that _text address cannot be part of kallsyms unless
> CONFIG_KALLSYMS_ALL is enabled or original _text is 64k aligned so it
> will coincide with _stext
>
> * in all my testing even with 5.10 kernel _text was missing in
> kallsyms, that it why I thought it presence constituted legacy case,
> in 5.10 case last it worked for me length check was failing and code
> was taking else part using _stext anyway
>
> * I believe my patch is correct and it should work for both cases when
> CONFIG_KALLSYMS_ALL is set or not.
>

Now, the crux is clear. It is introduced by the filtering of
CONFIG_KALLSYMS_ALL. And the kernel layout has not changed.

> * Maybe we can remove whole 'kva_text_end - kva_stext == length' check
> and set 'elf_info.kern_paddr_start = base' unconditionally, but I am
> concerned that some older kernels will trip on it
>

Yes, compatibility is a challenge, that is why the code looks like this.

Furthermore, since the kernel layout does not change, the code logic
stands. And I think you should consider about changing
"elf_info.kern_vaddr_start = get_kernel_sym("_text");"  to ("_stext").

Finally you can unify the changes there and the following code snippet.

Thanks,

Pingfan

>     if (kva_text_end - kva_stext == length)
>         elf_info.kern_paddr_start = base - (kva_stext - kva_text);
>     else
>         elf_info.kern_paddr_start = base;
>
> Thanks,
> Alexander Kamensky
>
> > On my side, I can see this info on 6.2.14-300.fc38.aarch64
> >
> > grep -w _text /proc/kallsyms
> > ffffc705d8500000 T _text
> >
> > Thanks,
> >
> > Pingfan
> > > root at qemuarm64:~# cat /proc/kallsyms | grep -e ' _stext$'
> > > ffffffc010010000 T _stext
> > > root at qemuarm64:~# cat /proc/kallsyms | grep -e ' __init_begin$'
> > > ffffffc010df0000 T __init_begin
> > >
> > > the function calculates elf_info.kern_paddr_start address similar to
> > > these manual steps:
> > >
> > > base = 0x40210000
> > > length = 0x40feffff - 0x40210000 + 0x1 = 0xde0000
> > > kva_text_end - kva_stext = 0xffffffc010df0000 - 0xffffffc010010000 = 0xde0000
> > >
> > > elf_info.kern_paddr_start = base - (kva_stext - kva_text) =
> > >                       0x40210000 - 0xffffffc010010000 = 0x4030200000
> > >
> > > since 'length' and 'kva_text_end - kva_stext' are equal the function calculate
> > > elf_info.kern_paddr_start as in legacy case:
> > >
> > >   if (kva_text_end - kva_stext == length)
> > >      elf_info.kern_paddr_start = base - (kva_stext - kva_text); // <---------
> > >   else
> > >      elf_info.kern_paddr_start = base;
> > >
> > > but my kernel is not legacy and kva_text is zero.
> > >
> > > The fix is just to add a check for kva_text not being zero before following the
> > > legacy case.
> > >
> > > Signed-off-by: Alexander Kamensky <alexander.kamensky42 at gmail.com>
> > > ---
> > >  kexec/arch/arm64/crashdump-arm64.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/kexec/arch/arm64/crashdump-arm64.c b/kexec/arch/arm64/crashdump-arm64.c
> > > index 3098315..639c82a 100644
> > > --- a/kexec/arch/arm64/crashdump-arm64.c
> > > +++ b/kexec/arch/arm64/crashdump-arm64.c
> > > @@ -82,7 +82,7 @@ static int iomem_range_callback(void *UNUSED(data), int UNUSED(nr),
> > >                  * For compatibility, deduce by comparing the gap "__init_begin - _stext"
> > >                  * and the res size of "Kernel code" in /proc/iomem
> > >                  */
> > > -               if (kva_text_end - kva_stext == length)
> > > +               if ((kva_text_end - kva_stext == length) && (kva_text != 0))
> > >                         elf_info.kern_paddr_start = base - (kva_stext - kva_text);
> > >                 else
> > >                         elf_info.kern_paddr_start = base;
> > > --
> > > 2.41.0
> > >
> > >
> > > _______________________________________________
> > > kexec mailing list
> > > kexec at lists.infradead.org
> > > http://lists.infradead.org/mailman/listinfo/kexec
> > >
> >
>




More information about the kexec mailing list