vmalloc faulting on RISC-V

Joerg Roedel jroedel at suse.de
Fri Sep 18 08:08:39 EDT 2020


On Wed, Sep 09, 2020 at 01:35:11PM -0700, Palmer Dabbelt wrote:
> If there really is some reason shared between RISC-V and x86 that makes this
> approach infeasible then we'll have to fix it.  Knowing why x86 changed their
> approach would be the first step.

On x86 there is a need to sync vmalloc paging updates at one single
level to all page-tables in the system. This is a slow operation and
requires a global spin-lock to be taken, so x86 did not synchronize the
mappings at creation time but used faulting to sync them to other
page-tables.
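
To illustrate the lazy scheme, here is a minimal user-space sketch. It is
a toy model, not kernel code: struct pgtable, NR_TOP_ENTRIES and all
function names are made up for illustration, and an entry value of 0
stands for "not present".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_TOP_ENTRIES 8	/* hypothetical number of top-level slots */

struct pgtable {
	unsigned long entry[NR_TOP_ENTRIES];	/* 0 means "not present" */
};

/* Stands in for the reference page-table that vmalloc updates. */
struct pgtable reference;

/* Mapping at creation time touches only the reference table. */
void map_in_reference(size_t idx, unsigned long val)
{
	assert(idx < NR_TOP_ENTRIES && val != 0);
	reference.entry[idx] = val;
}

/*
 * Fault path: copy the missing entry from the reference table.
 * Returns false if the reference has no mapping either, which
 * would be a genuine bad access.
 */
bool vmalloc_fault(struct pgtable *pgt, size_t idx)
{
	if (reference.entry[idx] == 0)
		return false;
	pgt->entry[idx] = reference.entry[idx];
	return true;
}

/* Access through one page-table, counting the faults taken. */
bool access_addr(struct pgtable *pgt, size_t idx, unsigned int *faults)
{
	assert(idx < NR_TOP_ENTRIES);
	if (pgt->entry[idx] == 0) {
		(*faults)++;
		if (!vmalloc_fault(pgt, idx))
			return false;
	}
	return true;
}
```

In this model the first access through a not-yet-synchronized table takes
one fault and later accesses take none. Creating the mapping stays cheap
because the cost is deferred to the first access.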

But this approach is problematic for at least two reasons:

	1) The page-fault handler might also access vmalloc'ed memory,
	   which can then result in recursive faults because the
	   page-fault handler can never finish the synchronization.

	   The primary case where this happened on x86 was with
	   tracing enabled, as the trace-buffer can be vmalloc'ed,
	   leading to recursive faults and an effectively frozen system.

	2) Faulting for synchronization does not work if valid ->
	   invalid changes need to be synchronized. This has happened on
	   x86-32 and caused memory corruption, because the page-table
	   level which needs synchronization (PMD) is also a huge-page
	   mapping level. The ioremap code can use huge-pages, so there
	   is a need to synchronize unmappings.
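
Point 2) can be shown with a small user-space sketch. Again this is a toy
model with made-up names (struct pgtable, NR_PMD_ENTRIES, the helper
functions), not the x86 code: an entry of 0 means "not present", and a
fault is only taken on a not-present entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_PMD_ENTRIES 4	/* hypothetical table size */

struct pgtable {
	unsigned long pmd[NR_PMD_ENTRIES];	/* 0 means "not present" */
};

/* An access only faults when the entry is not present. */
bool access_faults(const struct pgtable *pgt, size_t idx)
{
	assert(idx < NR_PMD_ENTRIES);
	return pgt->pmd[idx] == 0;
}

/* Fault-based synchronization: copy the entry from the reference. */
void sync_on_fault(struct pgtable *pgt, const struct pgtable *ref,
		   size_t idx)
{
	assert(idx < NR_PMD_ENTRIES);
	pgt->pmd[idx] = ref->pmd[idx];
}

/*
 * True if the table disagrees with the reference but an access
 * would not fault, so fault-based sync never gets a chance to run.
 */
bool silently_stale(const struct pgtable *pgt, const struct pgtable *ref,
		    size_t idx)
{
	return !access_faults(pgt, idx) && pgt->pmd[idx] != ref->pmd[idx];
}
```

If an entry is mapped in the reference, synced into a second table by a
fault, and then cleared in the reference only, silently_stale() returns
true for the second table. The stale entry is still present there, so no
fault is raised and the unmapping is never propagated.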

So x86 worked around these problems by having a function in the kernel
called vmalloc_sync_all() which had to be called at random places to
actively synchronize the mappings. But that never solved the underlying
issues so they popped up again and again in the past.

This was the reason I removed the vmalloc_sync_all() interface [by that
time it was already split into vmalloc_sync_mappings/unmappings()] and
introduced the arch_sync_kernel_mappings() function, which is only
called when a relevant mapping changed and which makes sure the memory
is mapped when vmalloc() returns.
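
The idea of synchronizing only when a relevant entry changed can be
sketched in user space like this. This is a toy model under my own
assumptions: model_set_range() and model_sync_kernel_mappings() are
made-up names that do not reproduce the real kernel signatures, and the
fixed array of tables stands in for "all page-tables in the system".

```c
#include <assert.h>
#include <stddef.h>

#define NR_TOP_ENTRIES 8	/* hypothetical number of top-level slots */
#define NR_TABLES 3		/* hypothetical number of page-tables */

struct pgtable {
	unsigned long entry[NR_TOP_ENTRIES];	/* 0 means "not present" */
};

struct pgtable reference;
struct pgtable all_tables[NR_TABLES];
unsigned int sync_calls;	/* how often the expensive sync ran */

/* Eagerly copy the range [start, end) into every page-table. */
void model_sync_kernel_mappings(size_t start, size_t end)
{
	assert(start <= end && end <= NR_TOP_ENTRIES);
	sync_calls++;
	for (size_t t = 0; t < NR_TABLES; t++)
		for (size_t i = start; i < end; i++)
			all_tables[t].entry[i] = reference.entry[i];
}

/*
 * Update the reference table (val == 0 unmaps) and synchronize
 * only if an entry at this level actually changed.
 */
void model_set_range(size_t start, size_t end, unsigned long val)
{
	int changed = 0;

	assert(start <= end && end <= NR_TOP_ENTRIES);
	for (size_t i = start; i < end; i++) {
		if (reference.entry[i] != val) {
			reference.entry[i] = val;
			changed = 1;
		}
	}
	if (changed)
		model_sync_kernel_mappings(start, end);
}
```

In this model every table already holds the entry by the time
model_set_range() returns, a repeated call with the same value skips the
sync, and an unmapping (val == 0) is propagated the same way as a mapping.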

Initially I thought this would also allow removing vmalloc faulting
completely, but that is not the case because of the race condition. But
arch_sync_kernel_mappings() serves as a decent replacement for the old
and even more broken vmalloc_sync_all() interface.

Regards,

	Joerg
