[RFC PATCH 14/18] KVM: Add asynchronous userfaults, KVM_READ_USERFAULT

Nikita Kalyazin kalyazin at amazon.com
Fri Jul 26 09:50:15 PDT 2024


Hi James,

On 11/07/2024 00:42, James Houghton wrote:
> It is possible that KVM wants to access a userfault-enabled GFN in a
> path where it is difficult to return out to userspace with the fault
> information. For these cases, add a mechanism for KVM to wait for a GFN
> to not be userfault-enabled.
In this patch series, the asynchronous notification mechanism is used
only in cases "where it is difficult to return out to userspace with the
fault information". However, we (AWS) have a use case where we would
like to be notified asynchronously about _all_ faults. Firecracker can
restore a VM from a memory snapshot where the guest memory is supplied
via userfaultfd by a process separate from the VMM itself [1]. While
it looks technically possible for the VMM process to handle the exits by
forwarding the faults to the other process, that would require building
a complex userspace protocol on top and would likely add latency on the
critical path. It also means that a KVM ioctl such as
KVM_READ_USERFAULT is not suitable for that separate process, because
KVM checks that its ioctls are issued from the address space of the VMM
process [2]:
	if (kvm->mm != current->mm || kvm->vm_dead)
		return -EIO;

> The implementation of this mechanism is certain to change before KVM
> Userfault could possibly be merged.
How do you envision resolving faults in userspace? Copying the page in
(provided that userspace mapping of guest_memfd is supported [3]) and
clearing KVM_MEMORY_ATTRIBUTE_USERFAULT do not look sufficient on
their own. An attempt to copy the page directly from userspace will
itself trigger a fault, which may lead to a deadlock if the original
fault was caused by the VMM. An interface similar to UFFDIO_COPY is
needed, one that would allocate a page, copy the content in and update
the page tables.
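
For reference, this is roughly what the handler process does today with
plain userfaultfd. It is only a sketch of the existing UFFDIO_COPY
usage, not a proposal for the KVM Userfault interface. The helper names
make_copy_req() and resolve_fault() are made up for illustration, and
the code assumes the uffd has already been created and registered over
the guest memory range.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/userfaultfd.h>

/*
 * Hypothetical helper: build a UFFDIO_COPY request for the page that
 * contains fault_addr. page_size must be a power of two.
 */
static struct uffdio_copy make_copy_req(uint64_t fault_addr, const void *src,
					uint64_t page_size)
{
	struct uffdio_copy req;

	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

	memset(&req, 0, sizeof(req));
	req.dst = fault_addr & ~(page_size - 1);	/* page-align the target */
	req.src = (uint64_t)(uintptr_t)src;		/* snapshot page content */
	req.len = page_size;
	req.mode = 0;					/* wake the faulting thread */
	return req;
}

/*
 * Hypothetical helper: resolve one fault. The kernel allocates the page,
 * copies the content in and installs the mapping in one step, so the
 * handler never touches the faulting address itself.
 * Returns 0 on success or a negative errno value.
 */
static int resolve_fault(int uffd, uint64_t fault_addr, const void *src,
			 uint64_t page_size)
{
	struct uffdio_copy req = make_copy_req(fault_addr, src, page_size);

	if (ioctl(uffd, UFFDIO_COPY, &req) == -1)
		return -errno;
	return 0;
}
```

The point is that the copy and the page table update happen inside the
kernel, so the handler cannot fault on the destination. That property is
what the copy-then-clear-attribute approach appears to lack.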

[1] Firecracker snapshot restore via UserfaultFD: 
https://github.com/firecracker-microvm/firecracker/blob/main/docs/snapshotting/handling-page-faults-on-snapshot-resume.md
[2] KVM ioctl check for the address space: 
https://elixir.bootlin.com/linux/v6.10.1/source/virt/kvm/kvm_main.c#L5083
[3] mmap() of guest_memfd: 
https://lore.kernel.org/kvm/489d1494-626c-40d9-89ec-4afc4cd0624b@redhat.com/T/#mc944a6fdcd20a35f654c2be99f9c91a117c1bed4

Thanks,
Nikita


