[PATCH 0/7] ARM: KVM: Revamping the HYP init code for fun and profit

Christoffer Dall cdall at cs.columbia.edu
Wed Apr 3 19:18:45 EDT 2013


On Tue, Apr 02, 2013 at 02:25:08PM +0100, Marc Zyngier wrote:
> Over the past few weeks, I've gradually realized how broken our HYP
> idmap code is. Badly broken.
> 
> The main problem is about supporting CPU hotplug. Imagine a CPU being
> initialized normally, running VMs, and then being powered down. So
> far, so good. Now mentally bring it back online. The CPU will come
> back via the secondary CPU boot path, and then what? We cannot use it
> anymore, because we need an idmap which is long gone, and because our
> page tables are now live, containing the world-switch code, VM
> structures, and other bits and pieces.
> 
> Another fun issue is that we don't have any TLB invalidation in the
> HYP init code. And guess what? we cannot do it! HYP TLB invalidation
> has to occur in HYP, and once we've installed the runtime page tables,
> it is already too late. It is actually fairly easy to construct a
> scenario where idmap and runtime pages have colliding translations.
> 
> The nail on the coffin was provided by Catalin Marinas who told me how
> much he disliked the arm64 HYP idmap code, and made me realize that we
> already have all the necessary code in arch/arm/kvm/mmu.c. It just
> needs a tiny bit of care and affection. With a chainsaw.
> 
> The solution to the first two issues is a bit tricky, but doesn't
> involve a lot of code. The hotplug problem mandates that we keep two
> sets of page tables (boot and runtime). The TLB problem mandates that
> we're able to transition from one PGD to another while in HYP,
> invalidating the TLBs in the process.
> 
> To be able to do this, we need to share a page between the two page
> tables. A page that will have the same VA in both configurations. All
> we need is a VA that has the following properties:
> - This VA can't be used to represent a kernel mapping.
> - This VA will not conflict with the physical address of the kernel
>   text
> 
> The vectors page VA seems to satisfy this requirement:
> - The kernel never maps anything else there
> - The kernel text being copied at the beginning of the physical
>   memory, it is unlikely to use the last 64kB (I doubt we'll ever
>   support KVM on a system with something like 4MB of RAM, but patches
>   are very welcome).
> 
> Let's call this VA the trampoline VA.
> 
> Now, we map our init page at 3 locations:
> - idmap in the boot pgd
> - trampoline VA in the boot pgd
> - trampoline VA in the runtime pgd
> 
> The init scenario is now the following:
> - We jump in HYP with four parameters: boot HYP pgd, runtime HYP pgd,
>   runtime stack, runtime vectors
> - Enable the MMU with the boot pgd
> - Jump to a target into the trampoline page (remember, this is the
>   same physical page!)
> - Now switch to the runtime pgd (same VA, and still the same physical
>   page!)
> - Invalidate TLBs
> - Set stack and vectors
> - Profit! (or eret, if you only care about the code).
> 
> Once we have this infrastructure in place, supporting CPU hot-plug is
> a piece of cake. Just wire a cpu-notifier in the existing code.
> 
> This has been tested on both arm (VE TC2) and arm64 (Foundation Model).
> 

So this looks quite good overall, thanks for taking care of this.  When
you send out a V2 it should be ready that I can take it in my tree and
send it further.

-Christoffer



More information about the linux-arm-kernel mailing list