[PATCH v1 1/3] arm64: mm: Fix rodata=full block mapping support for realm guests

Fri Mar 27 13:49:07 PDT 2026

On 3/25/26 10:29 AM, Ryan Roberts wrote:
> On 23/03/2026 21:34, Yang Shi wrote:
>>
>> On 3/23/26 6:03 AM, Ryan Roberts wrote:
>>> Commit a166563e7ec37 ("arm64: mm: support large block mapping when
>>> rodata=full") enabled the linear map to be mapped by block/cont while
>>> still allowing granular permission changes on BBML2_NOABORT systems by
>>> lazily splitting the live mappings. This mechanism was intended to be
>>> usable by realm guests since they need to dynamically share dma buffers
>>> with the host by "decrypting" them - which for Arm CCA, means marking
>>> them as shared in the page tables.
>>>
>>> However, it turns out that the mechanism was failing for realm guests
>>> because realms need to share their dma buffers (via
>>> __set_memory_enc_dec()) much earlier during boot than
>>> split_kernel_leaf_mapping() was able to handle. The report linked below
>>> showed that GIC's ITS was one such user. But during the investigation I
>>> found other callsites that could not meet the
>>> split_kernel_leaf_mapping() constraints.
>>>
>>> The problem is that we block map the linear map based on the boot CPU
>>> supporting BBML2_NOABORT, then check that all the other CPUs support it
>>> too when finalizing the caps. If they don't, then we stop_machine() and
>>> split to ptes. For safety, split_kernel_leaf_mapping() previously
>>> wouldn't permit splitting until after the caps were finalized. That
>>> ensured that if any secondary cpus were running that didn't support
>>> BBML2_NOABORT, we wouldn't risk breaking them.
>>>
>>> I've fixed this problem by reducing the black-out window where we refuse
>>> to split; there are now 2 windows. The first is from T0 until the page
>>> allocator is initialized. Splitting allocates memory from the page
>>> allocator so it must be in use. The second covers the period between
>>> starting to online the secondary cpus until the system caps are
>>> finalized (this is a very small window).
>>>
>>> All of the problematic callers are calling __set_memory_enc_dec() before
>>> the secondary cpus come online, so this solves the problem. However, one
>>> of these callers, swiotlb_update_mem_attributes(), was trying to split
>>> before the page allocator was initialized. So I have moved this call
>>> from arch_mm_preinit() to mem_init(), which solves the ordering issue.
>>>
>>> I've added warnings and return an error if any attempt is made to split
>>> in the black-out windows.
>>>
>>> Note there are other issues which prevent booting all the way to user
>>> space, which will be fixed in subsequent patches.
>> Hi Ryan,
>>
>> Thanks for putting everything together and getting the patches out so
>> quickly. It basically looks good to me. However, I'm wondering whether we
>> should have split_kernel_leaf_mapping() call different memory allocators at
>> different stages. If the buddy allocator has been initialized, it can call
>> the page allocator; otherwise, for example in the early boot stage, it can
>> call the memblock allocator. Then split_kernel_leaf_mapping() could be
>> called at any time and we wouldn't have to rely on the boot order of
>> subsystems.
> I considered that, but ultimately we would just be adding dead code. I've added
> a warning that will catch this usage. So I'd prefer to leave it as is for now
> and only add this functionality if we identify a need.

OK, fine with me. I don't have a strong preference either way.

Thanks,
Yang

>
> Thanks,
> Ryan
>
>
>> Thanks,
>> Yang
>>