[PATCH v3 0/5] RISC-V: Add kexec/kdump support

Fri Apr 9 11:02:41 BST 2021

Στις 2021-04-07 19:29, Rob Herring έγραψε:
> On Mon, Apr 05, 2021 at 11:57:07AM +0300, Nick Kossifidis wrote:
>> This patch series adds kexec/kdump and crash kernel
>> support on RISC-V. For testing the patches a patched
>> version of kexec-tools is needed (still a work in
>> progress) which can be found at:
>> 
>> https://riscv.ics.forth.gr/kexec-tools-patched.tar.xz
>> 
>> v3:
>>  * Rebase on newer kernel tree
>>  * Minor cleanups
>>  * Split UAPI changes to a separate patch
>>  * Improve / cleanup init_resources
>>  * Resolve Palmer's comments
>> 
>> v2:
>>  * Rebase on newer kernel tree
>>  * Minor cleanups
>>  * Properly populate the ioresources tre, so that it
>>    can be used later on for implementing strict /dev/mem
>>  * Use linux,usable-memory on /memory instead of a new binding
> 
> Where? In any case, that's not going to work well with EFI support
> assuming like arm64, 'memory' is passed in UEFI structures instead of
> DT. That's why there's now a /chosen linux,usable-memory-ranges
> property.
> 

Here:
https://elixir.bootlin.com/linux/v5.12-rc5/source/drivers/of/fdt.c#L1001

The "linux,usable-memory" binding is already defined and is part of
early_init_dt_scan_memory() which we call on mm/init.c to determine
system's memory layout. It's simple, clean and I don't see a reason
to use another binding on /chosen and add extra code for this, when
we already handle it on early_init_dt_scan_memory() anyway. As for
EFI, even when enabled, we still use DT to determine system memory
layout, not EFI structures, plus I don't see how EFI is relevant
here, the bootloader in kexec's case is Linux, not EFI. BTW the /memory
node is mandatory in any case, it should exist on DT regardless of EFI,
/chosen node on the other hand is -in general- optional, and we can 
still
boot a riscv system without /chosen node present (we only require it for
the built-in cmdline to work).

Also a simple grep for "linux,usable-memory-ranges" on the latest kernel
sources didn't return anything, there is also nothing on chosen.txt, 
where
is that binding documented/implemented ?

> Isn't the preferred kexec interface the file based interface? I'd
> expect a new arch to only support that. And there's common kexec DT
> handling for that pending for 5.13.
> 

Both approaches have their pros an cons, that's why both are available, 
in no
way CONFIG_KEXEC is deprecated in favor of CONFIG_KEXEC_FILE, at least 
not as
far as I know. The main point for the file-based syscall is to support 
secure
boot, since the image is loaded by the kernel directly without any 
processing
by the userspace tools, so it can be pre-signed by the kernel's 
"vendor". On
the other hand, the kernel part is more complicated and you can't pass a 
new
device tree, the kernel needs to re-use the existing one (or modify it
in-kernel), you can only override the cmdline.

This doesn't work for our use cases in FORTH, where we use kexec not 
only to
re-boot our systems, but also to boot to a system with different hw 
layout
(e.g. FPGA prototypes or systems with FPGAs on the side), device tree 
overlays
also don't cover our use cases. To give you an idea we can 
add/remove/modify
devices, move them to another region etc and still use kexec to avoid 
going
through the full boot cycle. We just unload their drivers, perform a 
full or
partial re-programming of the FPGA from within Linux, and kexec to the 
new
system with the new device tree. The file-based syscall can't cover this
scenario, in general it's less flexible and it's only there for secure 
boot,
not for using custom-built kernels, nor custom device tree images.

Security-wise the file load syscall provides guarantees for integrity 
and
authenticity, but depending on the kernel "vendor"'s infrastructure and
signing process this may allow e.g. to load an older/vulnerable kernel 
through
kexec and get away with it, there is no check as far as I know  to make 
sure
the loaded kernel is at least as old as the running kernel, the 
assumption is
that the "vendor" will use a different signing key/cert for each kernel 
and
that you'll kexec to a kernel/crash kernel that's the same version as 
the
running one. Until we have clear guidelines on how this is meant to be 
used
and have a discussion on secure boot within RISC-V (we have something on
the TEE TG but we'll probably switch to a SIG committee for this), I 
don't
see how this feature is a priority compared to the more generic 
CONFIG_KEXEC.

Regards,
Nick