[PATCHv11 05/19] x86/relocate_kernel: Use named labels for less confusion
Andrew Cooper
andrew.cooper3 at citrix.com
Wed Jun 12 16:06:07 PDT 2024
On 12/06/2024 10:22 am, Kirill A. Shutemov wrote:
> On Tue, Jun 11, 2024 at 11:26:17AM -0700, H. Peter Anvin wrote:
>> On 6/4/24 08:21, Kirill A. Shutemov wrote:
>>> From b45fe48092abad2612c2bafbb199e4de80c99545 Mon Sep 17 00:00:00 2001
>>> From: "Kirill A. Shutemov" <kirill.shutemov at linux.intel.com>
>>> Date: Fri, 10 Feb 2023 12:53:11 +0300
>>> Subject: [PATCHv11.1 06/19] x86/kexec: Keep CR4.MCE set during kexec for TDX guest
>>>
>>> TDX guests run with MCA enabled (CR4.MCE=1b) from the very start. If
>>> that bit is cleared during CR4 register reprogramming during boot or
>>> kexec flows, a #VE exception will be raised which the guest kernel
>>> cannot handle.
>>>
>>> Therefore, make sure the CR4.MCE setting is preserved over kexec too and
>>> avoid raising any #VEs.
>>>
>>> The change doesn't affect non-TDX-guest environments.
>>>
>>> Signed-off-by: Kirill A. Shutemov <kirill.shutemov at linux.intel.com>
>>> ---
>>> arch/x86/kernel/relocate_kernel_64.S | 17 ++++++++++-------
>>> 1 file changed, 10 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/arch/x86/kernel/relocate_kernel_64.S b/arch/x86/kernel/relocate_kernel_64.S
>>> index 085eef5c3904..9c2cf70c5f54 100644
>>> --- a/arch/x86/kernel/relocate_kernel_64.S
>>> +++ b/arch/x86/kernel/relocate_kernel_64.S
>>> @@ -5,6 +5,8 @@
>>> */
>>> #include <linux/linkage.h>
>>> +#include <linux/stringify.h>
>>> +#include <asm/alternative.h>
>>> #include <asm/page_types.h>
>>> #include <asm/kexec.h>
>>> #include <asm/processor-flags.h>
>>> @@ -145,14 +147,15 @@ SYM_CODE_START_LOCAL_NOALIGN(identity_mapped)
>>> * Set cr4 to a known state:
>>> * - physical address extension enabled
>>> * - 5-level paging, if it was enabled before
>>> + * - Machine check exception on TDX guest, if it was enabled before.
>>> + * Clearing MCE might not be allowed in TDX guests, depending on setup.
>>> + *
>>> + * Use R13 that contains the original CR4 value, read in relocate_kernel().
>>> + * PAE is always set in the original CR4.
>>> */
>>> - movl $X86_CR4_PAE, %eax
>>> - testq $X86_CR4_LA57, %r13
>>> - jz .Lno_la57
>>> - orl $X86_CR4_LA57, %eax
>>> -.Lno_la57:
>>> -
>>> - movq %rax, %cr4
>>> + andl $(X86_CR4_PAE | X86_CR4_LA57), %r13d
>>> + ALTERNATIVE "", __stringify(orl $X86_CR4_MCE, %r13d), X86_FEATURE_TDX_GUEST
>>> + movq %r13, %cr4
>> If this is the case, I don't really see a reason to clear MCE per se as I'm
>> guessing a machine check here will be fatal anyway? It just changes the
>> method of death.
> Andrew had a strong opinion on method of death here.
>
> https://lore.kernel.org/all/1144340e-dd95-ee3b-dabb-579f9a65b3c7@citrix.com
Not sure if I intended it to come across that strongly, but given a
choice, the !CR4.MCE death is cleaner because at least you're not
interpreting garbage and trying to use it as a valid IDT.
~Andrew
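
For readers following the thread, the CR4 value the quoted patch ends up loading can be modelled in C roughly as below. This is only an illustrative sketch, not code from the patch: the function name kexec_cr4() and the tdx_guest parameter are hypothetical stand-ins for the assembly sequence and the X86_FEATURE_TDX_GUEST alternative. The bit positions are the architectural ones (PAE is bit 5, MCE is bit 6, LA57 is bit 12).

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural CR4 bit positions. */
#define X86_CR4_PAE  (1ULL << 5)   /* physical address extension */
#define X86_CR4_MCE  (1ULL << 6)   /* machine check enable */
#define X86_CR4_LA57 (1ULL << 12)  /* 5-level paging */

/*
 * Hypothetical model of the patched sequence:
 *   andl $(X86_CR4_PAE | X86_CR4_LA57), %r13d
 *   ALTERNATIVE "", orl $X86_CR4_MCE, %r13d  (TDX guest only)
 *   movq %r13, %cr4
 * orig_cr4 stands for the value saved in R13 by relocate_kernel();
 * tdx_guest stands for the X86_FEATURE_TDX_GUEST alternative being patched in.
 */
static uint64_t kexec_cr4(uint64_t orig_cr4, bool tdx_guest)
{
	/* Keep only PAE and LA57 from the original CR4; drop everything else. */
	uint64_t cr4 = orig_cr4 & (X86_CR4_PAE | X86_CR4_LA57);

	/* On a TDX guest, MCE is forced on so that it is never cleared. */
	if (tdx_guest)
		cr4 |= X86_CR4_MCE;

	return cr4;
}
```

In this model a non-TDX system always ends up with CR4.MCE clear, which is the case the "method of death" discussion above is about.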