[RFC/RFT PATCH 0/3] arm64: KVM: work around incoherency with uncached guest mappings

Catalin Marinas catalin.marinas at arm.com
Wed Mar 4 04:29:57 PST 2015

On Wed, Mar 04, 2015 at 12:50:57PM +0100, Ard Biesheuvel wrote:
> On 4 March 2015 at 12:35, Catalin Marinas <catalin.marinas at arm.com> wrote:
> > On Mon, Mar 02, 2015 at 06:20:19PM -0800, Mario Smarduch wrote:
> >> On 03/02/2015 08:31 AM, Christoffer Dall wrote:
> >> > However, my concerns with these patches are on two points:
> >> >
> >> > 1. It's not a fix-all.  We still have the case where the guest expects
> >> > the behavior of device memory (for strong ordering for example) on a RAM
> >> > region, which we now break.  Similarly this doesn't support the
> >> > non-coherent DMA to RAM region case.
> >> >
> >> > 2. While the code is probably as nice as this kind of stuff gets, it
> >> > is non-trivial and extremely difficult to debug.  The counter-point here
> >> > is that we may end up handling other stuff at EL2 for performance reasons
> >> > in the future.
> >> >
> >> > Mainly because of point 1 above, I am leaning to thinking userspace
> >> > should do the invalidation when it knows it needs to, either through KVM
> >> > via a memslot flag or through some other syscall mechanism.
> >
> > I expressed my concerns as well; I'm definitely against merging this
> > series.
> Don't worry, that was never the intention, at least not as-is :-)

I wasn't worried, just wanted to make my position clearer ;).

> I think we have established that the performance hit is not the
> problem but the correctness is.

I haven't looked at the performance figures but has anyone assessed the
hit caused by doing cache maintenance in Qemu vs cacheable guest
accesses (and no maintenance)?

> I do have a remaining question, though: my original [non-working]
> approach was to replace uncached mappings with write-through
> read-allocate write-allocate,

Does it make sense to have write-through and write-allocate at the same
time? The write-allocate hint would probably be ignored as write-through
writes do not generate linefills.

> which I expected would keep the caches
> in sync with main memory, but apparently I am misunderstanding
> something here. (This is the reason for s/0xbb/0xff/ in patch #2 to
> get it to work: it replaces WT/RA/WA with WB/RA/WA)
> Is there no way to use write-through caching here?

Write-through is considered non-cacheable from a write perspective when
it does not hit in the cache. AFAIK, it should still be able to hit
existing cache lines and evict. The ARM ARM states that cache cleaning
to _PoU_ is not required for coherency when the writes are to
write-through memory but I have to dig further into the PoC because
that's what we care about here.
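
For reference, here is a minimal sketch of how the two attribute bytes in
the s/0xbb/0xff/ change decode under the baseline ARMv8 MAIR_ELx layout
(outer cacheability in bits [7:4], inner in bits [3:0]). The helper names
are made up for illustration and are not kernel functions. Later
architecture extensions redefine a few encodings, which this sketch ignores.

```python
# Sketch only: decode one MAIR_ELx attribute byte (baseline ARMv8 layout).
# decode_nibble() and decode_mair_attr() are hypothetical helper names.

def decode_nibble(n):
    """Decode a non-zero 4-bit Normal-memory cacheability field."""
    if n == 0b0100:
        return "NC"  # Normal Non-cacheable
    # Bits [3:2] select the policy; bit 1 is read-allocate, bit 0 write-allocate.
    policy = {0b00: "WT-transient", 0b01: "WB-transient",
              0b10: "WT", 0b11: "WB"}[n >> 2]
    hints = []
    if n & 0b10:
        hints.append("RA")
    if n & 0b01:
        hints.append("WA")
    return "/".join([policy] + hints)


def decode_mair_attr(attr):
    """Decode an 8-bit attribute: outer in bits [7:4], inner in bits [3:0]."""
    outer, inner = (attr >> 4) & 0xF, attr & 0xF
    if outer == 0:
        return "Device"          # upper nibble 0b0000 selects Device memory
    if inner == 0:
        return "UNPREDICTABLE"   # Normal outer with inner 0b0000
    return "outer %s, inner %s" % (decode_nibble(outer), decode_nibble(inner))


if __name__ == "__main__":
    for value in (0xbb, 0xff, 0x44):
        print(hex(value), "->", decode_mair_attr(value))
```

Decoded this way, 0xbb and 0xff differ only in the policy bits (WT vs WB)
of each nibble; the RA/WA hint bits are identical in both.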

What platform did you test it on? I can't tell what the behaviour of
system caches is. I know they intercept explicit cache maintenance by VA
but not sure what happens to write-through writes when they hit in the
system cache (are they evicted to RAM or not?). If such write-through
writes are only evicted to the point-of-unification, they won't work
since non-cacheable accesses go all the way to PoC.

I need to do more reading through the ARM ARM, it should be hidden
somewhere ;).


