dax alignment problem on arm64 (and other architectures)
Joao Martins
joao.m.martins at oracle.com
Fri Jan 29 12:22:38 EST 2021
On 1/29/21 4:32 PM, Pavel Tatashin wrote:
> On Fri, Jan 29, 2021 at 9:51 AM Joao Martins <joao.m.martins at oracle.com> wrote:
>>
>> Hey Pavel,
>>
>> On 1/29/21 1:50 PM, Pavel Tatashin wrote:
>>>> Since we last talked about this the enabling for EFI "Special Purpose"
>>>> / Soft Reserved Memory has gone upstream and instantiates device-dax
>>>> instances for address ranges marked with EFI_MEMORY_SP attribute.
>>>> Critically this way of declaring device-dax removes the consideration
>>>> of it as persistent memory and as such no metadata reservation. So, if
>>>> you are willing to maintain the metadata external to the device (which
>>>> seems reasonable for your environment) and have your platform firmware
>>>> / kernel command line mark it as EFI_CONVENTIONAL_MEMORY +
>>>> EFI_MEMORY_SP, then these reserve-free dax-devices will surface.
>>>
>>> Hi Dan,
>>>
>>> This is cool. Does it allow conversion between devdax and fsdax so DAX
>>> aware filesystem can be installed and data can be put there to be
>>> preserved across the reboot?
>>>
>>
>> fwiw wrt the 'preserved across kexec' part, you are going to need
>> something conceptually similar to the snippet below the scissors mark.
>> Alternatively, we could fix kexec userspace to add conventional memory
>> ranges (without the SP attribute part) when it sees a Soft-Reserved region.
>> But can't tell which one is the right thing to do.
>
> Hi Joao,
>
> Isn't it just a matter of appending arguments to the kernel parameters
> during kexec reboot with the Soft-Reserved region specified, or am I
> missing something? I understand that with the fileload kexec syscall we
> might accidentally load segments onto a reserved region, but with the
> original kexec syscall, where we can specify destinations for each segment,
> that should not be a problem with today's kexec tools.
>
efi_fake_mem only works with EFI_MEMMAP conventional memory ranges, thus
not having an EFI_MEMMAP with RAM ranges means it's a nop for the soft-reserved
regions. Unless you are trying to suggest something like:
memmap=<size>%<start>+0xefffffff
... to mark soft reserved on top of existing RAM? Sadly, I don't know if
there's an equivalent for ARM.
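
To make the memmap= idea concrete, here is a minimal sketch of how such an
argument could be put together. It assumes the size%offset ordering described
for memmap= in the kernel parameters documentation, and that 0xefffffff is the
x86 e820 type for soft reserved. The helper name is hypothetical and this is
x86-only, so it does not answer the ARM question.

```python
# Assumption: x86 e820 type value used for "soft reserved" memory.
E820_TYPE_SOFT_RESERVED = 0xEFFFFFFF


def soft_reserved_memmap_arg(start: int, size: int) -> str:
    """Build a memmap= argument that retypes [start, start + size) as soft reserved.

    Hypothetical helper for illustration; it only formats the string and does
    not check that the range is actually RAM on the running system.
    """
    if start < 0:
        raise ValueError("start must be non-negative")
    if size <= 0:
        raise ValueError("size must be positive")
    # Assumed layout: memmap=<size>%<offset>+<newtype>, with no -<oldtype> filter.
    return f"memmap={size:#x}%{start:#x}+{E820_TYPE_SOFT_RESERVED:#x}"


if __name__ == "__main__":
    # Example: 1 GiB starting at the 4 GiB boundary.
    print(soft_reserved_memmap_arg(0x100000000, 0x40000000))
```

The resulting string would be appended to the command line of the kexec'ed
kernel, one argument per range.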
> I agree that preserving it automatically as you are proposing, would
> make more sense, instead of fiddling with kernel parameters and
> segment destinations.
>
> Thank you,
> Pasha
>
>>
>> At the moment, HMAT ranges (or those defined with efi_fake_mem=) aren't
>> preserved, not because of anything special with HMAT, but simply because
>> the EFI memmap conventional RAM ranges are not preserved (only runtime
>> services). And HMAT/efi_fake_mem expect these to be based on the EFI memmap.
>>
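
For the kexec userspace alternative quoted above, the first step would be
spotting the Soft-Reserved regions at all. A rough sketch, assuming they show
up in /proc/iomem under the resource name "Soft Reserved" and that the file is
read as root (otherwise the addresses are zeroed). The function name is
hypothetical and this is not how kexec-tools does it today.

```python
import re

# Matches /proc/iomem lines such as "  100000000-13fffffff : Soft Reserved".
_IOMEM_LINE = re.compile(r"^\s*([0-9a-fA-F]+)-([0-9a-fA-F]+)\s*:\s*(.+?)\s*$")


def soft_reserved_ranges(iomem_text: str) -> list[tuple[int, int]]:
    """Return (start, size) for every resource named "Soft Reserved".

    Assumption: the resource name is exactly "Soft Reserved"; child
    resources nested under it (e.g. a dax device) have other names and
    are skipped.
    """
    ranges = []
    for line in iomem_text.splitlines():
        m = _IOMEM_LINE.match(line)
        if m and m.group(3) == "Soft Reserved":
            start = int(m.group(1), 16)
            end = int(m.group(2), 16)  # /proc/iomem end addresses are inclusive
            ranges.append((start, end - start + 1))
    return ranges


if __name__ == "__main__":
    with open("/proc/iomem") as f:
        for start, size in soft_reserved_ranges(f.read()):
            print(f"{start:#x} {size:#x}")
```

What to do with each range afterwards (pass it on as conventional memory or
keep the SP attribute) is the open question from the thread.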
[snip]