[PATCH] kexec_file: Drop weak attribute from arch_kexec_apply_relocations[_add]

Eric W. Biederman ebiederm at xmission.com
Fri May 20 12:25:05 PDT 2022


Baoquan He <bhe at redhat.com> writes:

> On 05/19/22 at 12:59pm, Eric W. Biederman wrote:
>> Baoquan He <bhe at redhat.com> writes:
>> 
>> > Hi Eric,
>> >
>> > On 05/18/22 at 04:59pm, Eric W. Biederman wrote:
>> >> "Naveen N. Rao" <naveen.n.rao at linux.vnet.ibm.com> writes:
>> >> 
>> >> > Since commit d1bcae833b32f1 ("ELF: Don't generate unused section
>> >> > symbols") [1], binutils (v2.36+) started dropping section symbols that
>> >> > it thought were unused.  This isn't an issue in general, but with
>> >> > kexec_file.c, gcc is placing arch_kexec_apply_relocations[_add] into a
>> >> > separate .text.unlikely section and the section symbol ".text.unlikely"
>> >> > is being dropped. Due to this, recordmcount is unable to find a non-weak
>> >> > symbol in .text.unlikely to generate a relocation record against.
>> >> >
>> >> > Address this by dropping the weak attribute from these functions:
>> >> > - arch_kexec_apply_relocations() is not overridden by any architecture
>> >> >   today, so just drop the weak attribute.
>> >> > - arch_kexec_apply_relocations_add() is only overridden by x86 and s390.
>> >> >   Retain the function prototype for those and move the weak
>> >> >   implementation into the header as a static inline for other
>> >> >   architectures.
>> >> >
>> >> > [1] https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=d1bcae833b32f1
>> >> 
>> >> Any chance you can also get machine_kexec_post_load,
>> >> crash_free_reserved_phys_range, arch_kexec_protect_crashkres,
>> >> arch_kexec_unprotect_crashkres, arch_kexec_kernel_image_probe,
>> >> arch_kimage_file_post_load_cleanup, arch_kexec_kernel_verify_sig,
>> >> and arch_kexec_locate_mem_hole as well?
>> >> 
>> >> That is everything in kexec that uses a __weak symbol.  If we can't
>> >> count on them working we might as well just get rid of the rest
>> >> preemptively.
>> >
>> > Is there a new rule that __weak is not suggested in kernel any more?
>> > Please help provide a pointer if yes, so that I can learn that.
>> >
>> > In my mind, __weak is very simple and clear as a mechanism to add
>> > ARCH related functionality.
>> 
>> You should be able to trace the conversation back for all of the details
>> but if you can't here is the summary.
>> 
>> There is a tool that some architectures use called recordmcount.
>> recordmcount looks for a symbol in a section, and ignores all weak
>> symbols.  In certain cases sections become so simple there are only weak
>> symbols, at which point recordmcount fails.
>> 
>> Which means in practice __weak symbols are unreliable and don't work
>> to add ARCH related functionality.
>> 
>> Given that __weak symbols fail randomly I would much rather have simpler
>> code that doesn't fail.  It has never been the case that __weak symbols
>> have been very common in the kernel.  I expect they are something like
>> bool that have been gaining traction.  Still, given that __weak symbols
>> don't work, I don't want them.
>
> Thanks for the summary, Eric.
>
> From Naveen's reply, what I got is, llvm's recent change makes
> symbol of section .text.unlikely lost,

If I have read the thread correctly this change happened in both
llvm and binutils, so it affects both tool chains that are used to
build the kernel.

> but the section .text.unlikely
> still exists. The __weak symbol will be put in .text.unlikely partly,
> when arch_kexec_apply_relocations_add() includes the pr_err line. While
> removing the pr_err() line will put __weak symbol
> arch_kexec_apply_relocations_add() in .text instead.

Yes.  Calling pr_err has some effect: either it causes an mcount
entry to be omitted, or it causes the symbols in the function to be
placed in .text.unlikely.

> Now the status is that not only recordmcount got this problem, objtool
> met it too and got an appropriate fix. Means objtool's fix doesn't need
> kernel's adjustment. Recordmcount needs kernel to adjust because it lacks
> continuous support and development. Naveen also told that they are
> converting to objtool, just the old CI cases rely on recordmcount. In
> fact, if someone stands up to get an appropriate recordmcount fix too,
> the problem will be gone too.

If the descriptions are correct I suspect recordmcount could just
decide to use the weak symbol, and not ignore it.

Unfortunately I looked at the code and it looks like recordmcount
is only ignoring weak symbols on arm.  So without being able to
reproduce this I don't understand enough of what is going on to fix
it.

> Asking this because __weak will be sentenced to death from now on, if we
> decide to change kernel. And this thread will be the pointer provided to
> others when telling them not to use __weak.

Well, knowing that it is recordmcount, all someone has to do is show that
recordmcount has been removed/fixed for the case in question.

> I am not strongly against taking off __weak, just wondering if there's
> chance to fix it in recordmcount, and the cost comparing with kernel fix;
> except of this issue, any other weakness of __weak. Noticed Andrew has
> picked this patch, as a witness of this moment, raise a tiny concern.

I just don't see what else we can realistically do.
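
For anyone following along, the approach in the patch description (a
prototype for the architectures that override the function, a static
inline fallback in the header for everyone else) has roughly the
following shape.  This is a minimal userspace sketch, not the kernel
code: the _demo names and the ARCH_HAS_APPLY_RELOCATIONS_ADD_DEMO macro
are made up for illustration, and the real patch may select the
architectures differently.

```c
#include <errno.h>
#include <stdio.h>

/*
 * Hypothetical opt-in macro: an architecture that supplies its own
 * implementation would define this in its arch header.  It is left
 * undefined here, so the generic fallback below is used.
 */
/* #define ARCH_HAS_APPLY_RELOCATIONS_ADD_DEMO */

#ifdef ARCH_HAS_APPLY_RELOCATIONS_ADD_DEMO
/* Prototype only; the architecture code provides the body. */
int apply_relocations_add_demo(unsigned int relsec);
#else
/*
 * Generic fallback, kept in the header as a static inline.  No weak
 * definition is emitted for it in the generic object file.
 */
static inline int apply_relocations_add_demo(unsigned int relsec)
{
	fprintf(stderr, "RELA relocation unsupported (section %u)\n", relsec);
	return -ENOEXEC;
}
#endif

/* Generic caller: identical whichever branch above was compiled in. */
int load_purgatory_demo(unsigned int relsec)
{
	return apply_relocations_add_demo(relsec);
}
```

With the macro undefined, load_purgatory_demo() returns -ENOEXEC from
the inline fallback.  Since no weak symbol is involved, the result does
not depend on how any tool treats weak symbols.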

Eric





