[PATCH v3] riscv: mm: Avoid spurious fault after hotplugging vmemmap

David Hildenbrand (Arm) david at kernel.org
Mon Jun 8 05:43:52 PDT 2026


On 6/8/26 04:24, Vivian Wang wrote:
> On 6/5/26 17:16, David Hildenbrand (Arm) wrote:
>> On 6/5/26 07:23, Vivian Wang wrote:
>>> section_activate() does not flush TLB after populating new vmemmap
>>> pages. On most architectures, this is okay. However it is a problem on
>>> RISC-V since there the TLB caching non-present entries is permitted,
>>> which causes spurious faults on some hardwares.
>>>
>>> This seems to be most easily reproduced with DEBUG_VM=y and
>>> PAGE_POISONING=y, which causes these newly mapped struct pages to be
>>> poisoned i.e. written to immediately after mapping.
>>>
>>> Add a hook vmemmap_populate_finalize() in __populate_section_memmap()
>>> after population, to allow architectures to handle such situations as
>>> needed. Then implement it on RISC-V to arrange for the existing
>>> exception handler code to deal with these faults if they happen.
>>>
>>> Signed-off-by: Vivian Wang <wangruikang at iscas.ac.cn>
>>> ---
>>> Depends on my earlier kfence fixes for mark_new_valid_map() [1].
>>>
>>> Found while testing AMD_HSA/ZONE_DEVICE on SpacemiT K3. Using
>>> ZONE_DEVICE requires another fix [2].
>>>
>>> [1]: https://lore.kernel.org/linux-riscv/20260303-handle-kfence-protect-spurious-fault-v2-0-f80d8354d79d@iscas.ac.cn
>>> [2]: https://lore.kernel.org/linux-riscv/20260309-riscv-sparsemem-vmemmap-limits-v1-2-f40efe18e3cd@iscas.ac.cn
>>> ---
>>> Changes in v3:
>>> - Merged back into one patch (Mike)
>>> - (No code changes otherwise.)
>>> - Link to v2: https://patch.msgid.link/20260604-mark-after-vmemmap-populate-v2-0-ab6a7d03b434@iscas.ac.cn
>>>
>>> Changes in v2:
>>> - Split patch in two, hook point and riscv hook 
>>> - Explain hook necessity in patch 1 message (Mike)
>>> - Make hook #define based (Mike)
>>> - Call finalize hook only on populate success
>>> - Link to v1: https://patch.msgid.link/20260525-mark-after-vmemmap-populate-v1-1-e698d859ba16@iscas.ac.cn
>>> ---
>>>  arch/riscv/include/asm/pgtable.h | 4 ++++
>>>  arch/riscv/mm/init.c             | 6 ++++++
>>>  mm/sparse-vmemmap.c              | 8 ++++++++
>>>  3 files changed, 18 insertions(+)
>>>
>>> diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
>>> index a1a7c6520a09..aa0f50e3d534 100644
>>> --- a/arch/riscv/include/asm/pgtable.h
>>> +++ b/arch/riscv/include/asm/pgtable.h
>>> @@ -1243,6 +1243,10 @@ static inline pte_t pte_swp_clear_exclusive(pte_t pte)
>>>  #define TASK_SIZE	FIXADDR_START
>>>  #endif
>>>  
>>> +/* Needed on SPARSEMEM_VMEMMAP */
>>> +#define vmemmap_populate_finalize vmemmap_populate_finalize
>>> +void __meminit vmemmap_populate_finalize(void);
>>> +
>>>  #else /* CONFIG_MMU */
>>>  
>>>  #define PAGE_SHARED		__pgprot(0)
>>> diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
>>> index 706f43523935..cf9ae4099f82 100644
>>> --- a/arch/riscv/mm/init.c
>>> +++ b/arch/riscv/mm/init.c
>>> @@ -1360,6 +1360,12 @@ int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node,
>>>  	 */
>>>  	return vmemmap_populate_hugepages(start, end, node, altmap);
>>>  }
>>> +
>>> +void __meminit vmemmap_populate_finalize(void)
>>> +{
>>> +	/* Avoid faults on cached non-present TLB entries. */
>>> +	mark_new_valid_map();
>>> +}
>>>  #endif
>> How does mark_new_valid_map() handle concurrent access?
>>
>> There is some comment in there, but I am not sure if that actually implies that
>> mark_new_valid_map() can be called concurrently from arbitrary context?
> 
> AFAICT, yes, it is intended to be used concurrently from arbitrary
> context, and this is why I used them.
> 
> I had not originally written the code, so some riscv maintainer can
> maybe chime in on this, but as I understand it the idea is that we only
> ever set bits in the bitmap in mark_new_valid_map() while "allocating",
> whereas the assembly code in handle_exception in
> arch/riscv/kernel/entry.S only ever clears individual bits. So at least
> the access to the bitmap itself should be fine.

Right, flush_cache_vmap() looks like a bitmap_set() -- non-atomically.

.Lnew_vmalloc_kernel_address does an atomic bitmap clearing ("Atomically reset
the current cpu bit in new_vmalloc").

I am primarily wondering whether we'd want atomic bit-setting as well?

> 
> This was added in commit 503638e0babf ("riscv: Stop emitting preventive
> sfence.vma for new vmalloc mappings").
> 
> Do you think the mark_new_valid_map() function needs extra
> synchronization like smp_wmb()?

Right, I wonder about atomic access, but also about memory barriers.

-- 
Cheers,

David



More information about the linux-riscv mailing list