kexec-starting-kernel-problem-on-vm

Baoquan He bhe at redhat.com
Mon Apr 19 09:26:16 BST 2021


Hi Jingxian,

On 04/14/21 at 03:04pm, Jingxian He wrote:
> We use ‘kexec -l’ and ‘kexec -e’ on our virtual machine to upgrade the
> Linux kernel. We find that, with low probability, the new kernel fails
> to start because the sha256 check of the initrd segment fails.
> 
> The related code is as follows:
> /* arch/x86/purgatory/purgatory.c */
> static int verify_sha256_digest(void)
> {
> 	struct kexec_sha_region *ptr, *end;
> 	u8 digest[SHA256_DIGEST_SIZE];
> 	struct sha256_state sctx;
> 
> 	sha256_init(&sctx);
> 	end = purgatory_sha_regions + ARRAY_SIZE(purgatory_sha_regions);
> 
> 	for (ptr = purgatory_sha_regions; ptr < end; ptr++)
> 		sha256_update(&sctx, (uint8_t *)(ptr->start), ptr->len);
> 
> 	sha256_final(&sctx, digest);
> 
> 	if (memcmp(digest, purgatory_sha256_digest, sizeof(digest)))
> 		return 1;
> 
> 	return 0;
> }
> 
> void purgatory(void)
> {
> 	int ret;
> 
> 	ret = verify_sha256_digest();

I usually test the kernel, kexec and kdump in a qemu/kvm guest and haven't
hit this issue; kexec -l/-e works well for me. Perhaps you are not using
the latest kexec-tools? If updating does not help, you can use
"-i (--no-checks)" to work around this for the time being.

> 	if (ret) { //<------verify_sha256 fail, entering loop forever
> 		/* loop forever */
> 		for (;;)
> 			;
> 	}
> 	copy_backup_region();
> }
> 
> 
> Our opinion of this problem:
> We think that relocating the new kernel depends on the boot cpu running
> without interruption. However, the vcpus may be interrupted by the qemu
> process with an async_page_fault interrupt, so there is a risk of memory
> being overwritten while the boot vcpu relocates the new kernel.
> 
> When we enable the KVM_GUEST feature and give the guest less than 500M of
> memory, the new kernel fails to start with ‘kexec -l/-e’ every time.

I am not familiar with qemu/kvm, so I have added several virt experts to
the CC list to see if they have any ideas about it. Meanwhile, you may need
to provide the kernel version you are testing. I am also wondering whether
you have tried the latest kexec-tools.

Thanks
Baoquan



