[RFC PATCH] ARM: uprobes need icache flush after xol write
Russell King - ARM Linux
linux at arm.linux.org.uk
Tue Apr 8 04:46:20 PDT 2014
On Tue, Apr 08, 2014 at 09:24:50AM +0100, Dave Martin wrote:
> You're right that something is needed.
> Would flush_icache_range() not be the correct thing to call here?
> Sync'ing the I and D sides for a whole page seems excessive to
> install a probe.
> I believe __cpuc_coherent_kern_range() is not intended as an API,
> and should only be used in special situations in arch-specific code.
> It's the correct operation, but arm provides flush_icache_range as a
> wrapper to call.
> flush_icache_range() is documented, with the correct effect, in
> Documentation/cachetlb.txt, so I believe that all arches should
> provide it, even if it's a no-op.
The real question is, what is this code in xol_get_insn_slot() trying
1. area->page is an anonymous page - it was allocated by
alloc_page(GFP_HIGHUSER). This makes use of flush_dcache_page()
questionable at best.
2. xol_get_insn_slot() appears to want to copy instructions into this
page via the kernel mapping such that they're visible to userspace.
So, the first thing that we need to do is to push the written data out to
a point where the instruction stream can see it. Then we need to ensure
that the instruction stream can see the new instructions.
Practically for ARM, that means pushing it out of the D-cache and then
making sure that there are no I-cache lines associated with the new
With VIVT instruction caches (yes, we still have them on ARMv7), we have
to flush the I-cache lines associated with the userspace addresses -
flushing the I-cache lines for the kernel mapping doesn't do anything for
the stale lines in the userspace mapping.
copy_to_page() kmap_atomic()'s the page - this doesn't take account of
VIPT aliasing caches, and so the writes may be in an alias of the user
page. Plus, when the page is kunmap_atomic()'d, there's no guarantee that
the written lines will be flushed out.
copy_from_page() suffers from a similar problem, though its saving grace
is that the user mapping is execute-only (iow, read+execute only) so
should never end up with any dirty cache lines associated with it.
flush_dcache_page() will ensure that the D-cache lines are flushed out on
ARM (even though it's only defined to have an effect for page cache pages)
but it does nothing for the I-cache. So, although it seems to partially
work, it's definitely the wrong interface here.
However... the normal way we do this (for example, in the case of ptrace)
is via copy_to_user_page() / copy_from_user_page(), and we have some
pretty complex handling via those functions to deal with writing to the
page and having the affected thread running on a different CPU to the
one we wrote the new instructions. I'd recommend that uprobes looks at
trying to re-use some of this infrastructure rather than re-inventing
FTTC broadband for 0.8mile line: now at 9.7Mbps down 460kbps up... slowly
improving, and getting towards what was expected from it.
More information about the linux-arm-kernel