[PATCH] ARM64: KVM: Fix coherent_icache_guest_page() for host with external L3-cache.

Catalin Marinas catalin.marinas at arm.com
Fri Aug 30 08:52:25 EDT 2013


On Fri, Aug 30, 2013 at 11:36:30AM +0100, Anup Patel wrote:
> On Fri, Aug 30, 2013 at 3:14 PM, Catalin Marinas
> <catalin.marinas at arm.com> wrote:
> > On Thu, Aug 29, 2013 at 05:02:50PM +0100, Anup Patel wrote:
> >> Actually, L3-cache monitors the types of read/write generated by CPU (i.e.
> >> whether the request is cacheable/non-cacheable or whether the request is
> >> due to DC ops to PoC, or ...).
> >>
> >> To answer your query, there is no configuration to have L3 caching when
> >> accesses are non-cacheable and DC ops to PoC.
> >
> > So it's an outer cache with some "improvements" to handle DC ops to PoC.
> > I think it was a pretty bad decision on the hardware side as we really
> > try to get rid of outer caches for many reasons:
> 
> Getting rid of outer-caches (such as in this context) may make sense for
> Embedded/Low-end systems but for Servers L3-cache makes a lot of sense.
> 
> Claiming this to be a bad decision all depends on what end application
> you are looking at.

It's not a bad decision to have a big L3 cache, that's a very *good*
decision for server applications. The bad part is that it is not fully
integrated with the CPU (for example, by allowing set/way operations to
flush this cache level).

> > 1. Non-standard cache flushing operations (MMIO-based)
> > 2. It may require cache maintenance by physical address - something
> >    hard to get in a guest OS (unless you virtualise L3 cache
> >    maintenance)
> 
> We don't have cache maintenance by physical address in our L3-cache.

Great.

> > 3. Are barriers like DSB propagated correctly? Does a DC op to PoC
> >    followed by DSB ensure that the L3 drained the cachelines to RAM?
> 
> Yes, DSB works perfectly fine for us with L3.
> Yes, DC op to PoC followed by DSB works fine with L3 too.

Even better.

See my other comment on flush_dcache_all() use in the kernel/KVM and why
I don't think it is suitable (which leaves us with DC ops to PoC in
KVM).
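
As a rough illustration of what "DC ops to PoC" followed by a DSB looks
like, here is a minimal sketch in C, not the actual KVM code. The function
name clean_dcache_range_to_poc() and the line_size parameter are made up
for this example (a kernel would derive the line size from
CTR_EL0.DminLine), and the instructions are only emitted when building for
AArch64; elsewhere the loop just counts lines.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: clean [start, start + size) by virtual address to
 * the Point of Coherency, then wait for completion with a DSB.
 * line_size is an assumed, caller-supplied power of two.
 * Returns the number of cache lines visited (0 on invalid arguments).
 */
static size_t clean_dcache_range_to_poc(const void *start, size_t size,
                                        size_t line_size)
{
    uintptr_t addr, end;
    size_t lines = 0;

    /* Reject empty ranges and line sizes that are not a power of two. */
    if (size == 0 || line_size == 0 || (line_size & (line_size - 1)))
        return 0;

    /* Round the start address down to a cache line boundary. */
    addr = (uintptr_t)start & ~(uintptr_t)(line_size - 1);
    end = (uintptr_t)start + size;

    for (; addr < end; addr += line_size) {
#if defined(__aarch64__)
        /* DC CVAC: clean this line by VA to the PoC. */
        __asm__ volatile("dc cvac, %0" : : "r"(addr) : "memory");
#endif
        lines++;
    }

#if defined(__aarch64__)
    /* DSB: wait until the maintenance operations above have completed. */
    __asm__ volatile("dsb sy" : : : "memory");
#else
    /* Compiler barrier only; no hardware cache maintenance is performed. */
    __asm__ volatile("" : : : "memory");
#endif

    return lines;
}
```

Because this works by virtual address, it relies on the L3 observing DC
ops to PoC as described above, and needs no MMIO access to the L3.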

> > I think point 2 isn't required because your L3 detects DC ops to PoC. I
> > hope point 3 is handled correctly (otherwise look how "nice" the mb()
> > macro on arm is to cope with L2x0).
> >
> > If only 1 is left, we don't need the full outer_cache framework but it
> > still needs to be addressed since the assumption is that flush_cache_all
> > (or __flush_dcache_all) flushes all cache levels. These are not used in
> > generic code but are used during kernel booting, KVM and cpuidle
> > drivers.
> >
> >> > Now, back to the idea of outer_cache framework for arm64. Does your CPU
> >> > have separate instructions for flushing this L3 cache?
> >>
> >> No, the CPU does not have a separate instruction for flushing L3-cache.
> >> On the other hand, L3-cache has MMIO registers which can be used to
> >> explicitly flush L3-cache.
> >
> > I guess you use those in your firmware or boot loader since Linux
> > requires clean/invalidated caches at boot (and I plan to push a patch
> > which removes kernel D-cache cleaning during boot to spot such problems
> > early). A cpuidle driver would probably need this as well.
> 
> Yes, cpuidle would be another place where we may need L3-cache
> maintenance. In other words, if we need to power-off L3-cache from
> kernel then we need to first flush it.

The problem is that if you need to disable the MMU in the kernel, you
need to flush the L3 cache first. Normally this would be done in the
firmware with PSCI, but that is most likely not the case for the Applied
hardware.

-- 
Catalin


