[v3 PATCH 0/6] arm64: support FEAT_BBM level 2 and large block mapping when rodata=full

Yang Shi yang at os.amperecomputing.com
Thu May 29 10:35:25 PDT 2025



On 5/29/25 8:33 AM, Ryan Roberts wrote:
> On 29/05/2025 09:48, Ryan Roberts wrote:
>
> [...]
>
>>>>> Regarding the linear map repainting, I had a chat with Catalin, and he reminded
>>>>> me of a potential problem; if you are doing the repainting with the machine
>>>>> stopped, you can't allocate memory at that point; it's possible a CPU was inside
>>>>> the allocator when it stopped. And I think you need to allocate intermediate
>>>>> pgtables, right? Do you have a solution to that problem? I guess one approach
>>>>> would be to figure out how much memory you will need and pre-allocate prior to
>>>>> stopping the machine?
>>>> OK, I don't remember we discussed this problem before. I think we can do
>>>> something like what kpti does. When creating the linear map we know how many
>>>> PUD and PMD mappings are created, we can record the number, it will tell how
>>>> many pages we need for repainting the linear map.
>>> Looking at the kpti code further, it looks like kpti also allocates memory with the
>>> machine stopped, but it calls memory allocation on cpu 0 only.
>> Oh yes, I hadn't spotted that. It looks like a special case that may be ok for
>> kpti though; it's allocating a fairly small amount of memory (max levels=5 so
>> max order=3) and it's doing it with GFP_ATOMIC. So if my understanding of the
>> page allocator is correct, then this should be allocated from a per-cpu reserve?
>> Which means that it never needs to take a lock that other, stopped CPUs could be
>> holding. And GFP_ATOMIC guarantees that the thread will never sleep, which I
>> think is not allowed while the machine is stopped.

The pcp lists should be set up by then, but IIRC they are not actually
populated until the first allocation happens.

>>
>>> IIUC this
>>> guarantees the code will not be called on a CPU which was inside the allocator
>>> when it stopped because CPU 0 is running stop_machine().
>> My concern was a bit more general; if any other CPU was inside the allocator
>> holding a lock when the machine was stopped, then if CPU 0 comes along and makes
>> a call to the allocator that requires the lock, then we have a deadlock.
>>
>> All that said, looking at the stop_machine() docs, it says:
>>
>>   * Description: This causes a thread to be scheduled on every cpu,
>>   * each of which disables interrupts.  The result is that no one is
>>   * holding a spinlock or inside any other preempt-disabled region when
>>   * @fn() runs.
>>
>> So I think my deadlock concern was unfounded. I think as long as you can
>> guarantee that fn() won't try to sleep then you should be safe? So I guess
>> allocating from within fn() should be safe as long as you use GFP_ATOMIC?

Yes, the deadlock should not be a concern.

The other comment also said:

  * On each target cpu, @fn is run in a process context with the highest
  * priority preempting any task on the cpu and monopolizing it.

Since fn runs in process context, sleeping should be OK? For the kpti
and linear map repainting use cases, a sleep should only happen when the
allocation has to reclaim memory because memory is short. But I do agree
GFP_ATOMIC is safer.

> I just had another conversation about this internally, and there is another
> concern; we obviously don't want to modify the pgtables while other CPUs that
> don't support BBML2 could be accessing them. Even in stop_machine() this may be
> possible if the CPU stacks and task structure (for example) are allocated out of
> the linear map.
>
> So we need to be careful to follow the pattern used by kpti; all secondary CPUs
> need to switch to the idmap (which is installed in TTBR0) then install the
> reserved map in TTBR1, then wait for CPU 0 to repaint the linear map, then have
> the secondary CPUs switch TTBR1 back to swapper then switch back out of idmap.

So the code below should be OK?

cpu_install_idmap()
busy loop until CPU 0 is done
cpu_uninstall_idmap()
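
To make the ordering concrete, here is a small userspace model of that
handshake. It is only a sketch, not kernel code: pthreads stand in for
CPUs, an integer stands in for the linear map, and every name in it
(repaint_model, secondary_cpu, cpus_parked, repaint_done, linear_map_gen,
NR_MODEL_CPUS) is made up for illustration. The comments mark where the
real idmap and TTBR1 switches would go. The one assumption it adds on top
of the three lines above is that CPU 0 waits until every secondary has
parked before it touches the map.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NR_MODEL_CPUS 4          /* hypothetical CPU count for the model */

static atomic_int  cpus_parked;  /* secondaries that reached the busy loop */
static atomic_bool repaint_done; /* set by "CPU 0" once the repaint is done */
static int linear_map_gen;       /* stand-in for the linear map page tables */

static void *secondary_cpu(void *unused)
{
	(void)unused;
	/* Kernel: cpu_install_idmap(), then point TTBR1 at the reserved map. */
	atomic_fetch_add(&cpus_parked, 1);
	/* Busy loop until CPU 0 is done; the yield is only userspace courtesy. */
	while (!atomic_load(&repaint_done))
		sched_yield();
	/* Kernel: put swapper back in TTBR1, then cpu_uninstall_idmap(). */
	return NULL;
}

/* Plays the role of CPU 0. Returns the new map generation, or -1 on error. */
int repaint_model(void)
{
	pthread_t threads[NR_MODEL_CPUS - 1];
	int started = 0, i;

	atomic_store(&cpus_parked, 0);
	atomic_store(&repaint_done, false);

	for (i = 0; i < NR_MODEL_CPUS - 1; i++) {
		if (pthread_create(&threads[i], NULL, secondary_cpu, NULL) != 0)
			break;
		started++;
	}

	if (started == NR_MODEL_CPUS - 1) {
		/* Wait until every secondary has left the linear map. */
		while (atomic_load(&cpus_parked) != started)
			sched_yield();
		assert(atomic_load(&cpus_parked) == started);
		linear_map_gen++;        /* the "repaint" itself */
	}

	/* Release the secondaries, including on the error path. */
	atomic_store(&repaint_done, true);
	for (i = 0; i < started; i++)
		pthread_join(threads[i], NULL);

	return started == NR_MODEL_CPUS - 1 ? linear_map_gen : -1;
}
```

Calling repaint_model() from a test program models one repaint. The
"repaint" only runs after all secondaries are parked, and no secondary
returns before repaint_done is set.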

>
> Given CPU 0 supports BBML2, I think it can just update the linear map live,
> without needing to do the idmap dance?

Yes, I think so too.

Thanks,
Yang

>
> Thanks,
> Ryan
>
>
>> Thanks,
>> Ryan
>>



