[PATCH][v2] arm64: Allocate elfcorehdr & crashkernel mem before any reservation

Mon Jan 22 21:08:14 PST 2018

Hi,

On Fri, Jan 19, 2018 at 5:46 PM, James Morse <james.morse at arm.com> wrote:
> Hello,
>
> On 16/01/18 07:07, takahiro.akashi at linaro.org wrote:
>> On Mon, Jan 15, 2018 at 10:14:05AM +0530, Bhupesh SHARMA wrote:
>>> On Sat, Jan 13, 2018 at 8:37 AM, Poonam Aggrwal <poonam.aggrwal at nxp.com> wrote:
>>>>> On 08/01/18 04:31, Poonam Aggrwal wrote:
>>>>>> Yeah, this is a good point. So ideally the address of the crash kernel
>>>>>> should be diligently provided by the user based on the system.
>
>>>>> Even better: the region to store the crash kernel in should be chosen by the
>>>>> kernel.
>>>>> When using kdump I boot with 'crashkernel=1G', the kernel chooses where to
>>>>> place the reserved region.
>>>>> Even if I specified a reasonable physical address, the
>>>>> efistub may relocate the kernel over the top during boot as part of its KASLR
>>>>> work.
>
>>>> Agree
>>>>>
>>>>> (Why does anyone ever need to specify an offset here?)
>>>> offset is normally an optional argument. Request Takahiro to provide his inputs.  Does this imply any updates in kexec design/implementation/documentation?
>>>
>>> offset is an optional argument. For relocatable kernels (and kernels
>>> which support KASLR), specifying offset is normally not needed.
>>>
>>> Please refer to the 'Extended crashkernel syntax' documentation
>>> (<https://github.com/torvalds/linux/blob/master/Documentation/kdump/kdump.txt#L259>):
>>>
>>> Extended crashkernel syntax
>>> ===========================
>>>
>>> While the "crashkernel=size[@offset]" syntax is sufficient for most
>>> configurations, sometimes it's handy to have the reserved memory dependent
>>> on the value of System RAM -- that's mostly for distributors that pre-setup
>>> the kernel command line to avoid a unbootable system after some memory has
>>> been removed from the machine.
>
> This is to let the distribution provide a single value that works on machines
> with very little RAM, and machines that need gigabytes in order to boot.
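
To make the size selection concrete, here is a minimal sketch in Python
(not the kernel's parser) of how one extended string such as
"512M-2G:64M,2G-:128M" maps to a reservation size. The names parse_mem()
and crashkernel_size() are made up for illustration, and any trailing
@offset is simply ignored:

```python
import re

# Illustrative only: parse_mem() and crashkernel_size() are hypothetical
# helpers, not functions from the kernel or kexec-tools.
UNITS = {"": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30}

def parse_mem(text):
    """Convert a string such as '64M' or '2G' into a byte count."""
    m = re.fullmatch(r"(\d+)([KMG]?)", text.strip().upper())
    if not m:
        raise ValueError(f"bad size: {text!r}")
    return int(m.group(1)) * UNITS[m.group(2)]

def crashkernel_size(param, system_ram):
    """Return the size to reserve for system_ram bytes, or 0 if no range matches."""
    spec = param.split("@", 1)[0]          # drop an optional @offset
    if ":" not in spec:                    # plain "crashkernel=size"
        return parse_mem(spec)
    for entry in spec.split(","):          # "start-[end]:size" entries
        rng, size = entry.split(":")
        start, end = rng.split("-")
        lo = parse_mem(start)
        hi = parse_mem(end) if end else None   # open-ended range, e.g. "2G-"
        if system_ram >= lo and (hi is None or system_ram < hi):
            return parse_mem(size)
    return 0

for ram in (256 << 20, 1 << 30, 4 << 30):
    print(ram >> 20, "MiB RAM ->", crashkernel_size("512M-2G:64M,2G-:128M", ram) >> 20, "MiB")
```

With that string a machine below 512M reserves nothing, one between 512M
and 2G reserves 64M, and anything larger reserves 128M. No offset is
involved in any of the three cases.
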
>
> Where does specifying the offset/absolute-address come in?

It comes into the picture when we know at compile time where we want to
place the crashkernel.
I remember using it on a few 32-bit arm machines, but they had a static,
well-defined memory map for the available system RAM, and required the
crashkernel to be placed at a predefined offset to prevent 'undefined'
behaviour.

I don't think this is relevant now, though, with the newer KASLR-enabled
kernels where we can randomize _text and also the module load area.
Using a static offset value in this case would be risky, to say the least ...
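
To illustrate the risk, here is a toy model in Python (not kernel code).
The addresses, the 2 MiB alignment and the helper names reserve_fixed()
and reserve_anywhere() are all assumptions made up for this example:

```python
# Illustrative only: a toy model of why a fixed crashkernel offset is fragile.
# Region values and helper names are hypothetical, not taken from the kernel.

def overlaps(a_start, a_size, b_start, b_size):
    """True if [a_start, a_start + a_size) intersects [b_start, b_start + b_size)."""
    return a_start < b_start + b_size and b_start < a_start + a_size

def reserve_fixed(offset, size, reserved):
    """Fixed-address request: fails if anything already occupies the range."""
    for start, length in reserved:
        if overlaps(offset, size, start, length):
            return None
    return offset

def reserve_anywhere(size, ram_start, ram_end, reserved, align=0x200000):
    """Dynamic request: first aligned hole in RAM that fits and is free."""
    base = (ram_start + align - 1) // align * align
    while base + size <= ram_end:
        if all(not overlaps(base, size, s, l) for s, l in reserved):
            return base
        base += align
    return None

MiB = 1 << 20
# Pretend the kernel image was randomized to 0x8a000000 on this boot.
reserved = [(0x8a000000, 32 * MiB)]

fixed = reserve_fixed(0x8a000000, 256 * MiB, reserved)
dynamic = reserve_anywhere(256 * MiB, 0x80000000, 0xc0000000, reserved)
print("fixed:", fixed)
print("dynamic:", hex(dynamic) if dynamic is not None else None)
```

The fixed request fails as soon as the randomized image lands on the
chosen address, while the size-only request just moves to the next free
hole.
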

>
>>> As James mentioned for arm64, in case of relocatable/kaslr kernels,
>>> the efistub may relocate the kernel over the top during boot as part
>>> of its KASLR.
>>
>> It would be sad if we couldn't specify kaslr and kdump at the same time.
>
> We can. The problem comes when you specify an absolute-address that should be
> reserved for user-space to eventually load the kdump kernel into. This is
> fragile for a number of reasons.

... Fedora allows using kdump and KASLR together, so I am not sure that's a problem.

>> Since kaslr will skip any of memory regions whose attributes are not
>> CONVENTIONAL_MEMORY for allocating a relocated kernel image, we will be
>> able to have a dedicated range of memory reserved for kdump.
>> In this case, using an "offset" in "crashkernel=" will be crucial.
>>
>> (I don't know how we can notify uefi of the region though.)
>
> I'm confused. panic()->kdump:boot doesn't go via UEFI. It passes information in
> the DT:/chosen that may have been generated by the EFIStub, but it doesn't (and
> must not) change the EFI memory map.
>
> We need to decide where the crashdump kernel region is when the first-kernel
> generates its page tables, as the protect/unprotect mechanism wants to be able
> to unmap them.
> There is no reason for UEFI to know about the kdump region, BootServices are
> long-gone by the time its location has to be decided.

>>> So, the offset field may make more sense for
>>> non-relocatable/static kernels, but for newer kernels, it's better to
>>> use the 'Extended crashkernel syntax', which is also supported
>>> by newer distribution versions.
>
> (this extended crashkernel syntax looks like a tangent: it's about using a
> single string to specify a reservation-size on both small-memory and
> large-memory machines).
>
> I don't think 'relocatable kernels' are relevant here. The KASLR series changed
> the kernel to no longer run from the linear map, so where in the linear map we
> allocate memory for the crash-kernel to boot from can't matter. These changes
> were merged before kexec/kdump support was added.

Agreed.

> Even before this change, (I recall that:) the kernel would discard memory below
> its text. This isn't a problem as kexec-tools (typically) locates the kernel at
> the bottom of the region, and physical memory outside this range isn't
> accessible anyway because of the "linux,usable-memory" property. When we do want
> to access it, we remap it using the vmcore helpers.
>
> I think this @offset must be for kernels that have to run from a
> physical/virtual address that is known at compile time. We don't have this
> problem on arm64, and specifying @offset makes kdump less reliable:
> | cannot reserve crashkernel: region overlaps reserved memory
>
>
>> How better is it for the case?
>>
>> I don't know exactly what you mean by "newer kernel/distribution", but
>> kdump on arm64 supports this feature from the day one.
>> (It is basically independent from architectures.)
>
> Support @offset? Yes, it's core code allowing this.
>
>
>>> For e.g. see ubuntu trusty kdump-config man page -
>>> <http://manpages.ubuntu.com/manpages/trusty/man8/kdump-config.8.html>:
>>>
>>>   kdump kernel relocation address does not match crashkernel= parameter:
>>>               For non-relocatable architectures,  the  kdump  kernel  must  be
>>>               built   with   a  predetermined  start  address.   This  message
>>>               indicates that the start address of the  kdump  kernel  and  the
>>>               start address in the crashkernel= parameter do not match.
>
> arm64 doesn't have this issue. The 'predetermined start address' is a relative
> value stored in the header. This is so the bootloader can place the kernel
> anywhere in memory and still have it boot.
>
> I think we're in the weeds here: adding @offset to your 'crashkernel=' cmdline
> option tells the kernel you know this address will be free and not-reserved when
> it comes to reservation time. This isn't generally true.
> Unless you wrote the DT and the bootloader, you can't know this.

Agreed. See my comments above. What I meant was that using @offset with
newer kernels/distributions which support KASLR can lead to undefined
behavior. What I tried to capture from the ubuntu trusty man page was that
the "kdump kernel must be built with a predetermined start address" for
non-relocatable architectures.

Normally we use the same primary kernel as the kdump kernel, so if the
primary kernel is relocatable and KASLR-enabled, the kdump kernel
supports the same as well. In that case, assuming a fixed offset at which
the kdump kernel can be loaded, or assuming a static start address for
the kdump kernel, can lead to issues.

Regards,
Bhupesh

> Wasn't this patch moving the elfcorehdr reservation up to be before any dynamic
> reservations, to prevent them overlapping?
>
>
> Thanks,
>
> James