[RFC PATCH v4] ARM: uprobes xol write directly to userspace
David Long
dave.long at linaro.org
Mon Apr 21 09:16:42 PDT 2014
On 04/16/14 18:25, Russell King - ARM Linux wrote:
> Our I-caches don't snoop/see the D-cache at all - so writes need to be
> pushed out to what we call the "point of unification" where the I and D
> streams meet. For anything we care about, that's normally the L2 cache -
> L1 cache is harvard, L2 cache is unified.
>
> Hence, we don't care which D-alias (if any) the data is written, so long
> as it's pushed out of the L1 data cache so that it's visible to the L1
> instruction cache.
>
> If we're writing via a different mapping to that which is being executed,
> I think the safest thing to do is to flush it out of the L1 D-cache at
> the address it was written, and then flush any stale line from the L1
> I-cache using the user address. This is quite a unique requirement, and
> we don't have anything which covers it. The closest you could get to
> that using existing calls is:
>
> 1. write the new instruction
> 2. flush_dcache_page()
> 3. flush_cache_user_range() using the user address
>
> and I think that should be safe on all the above cache types.
>
It doesn't feel to me like we yet have a clear consensus on the appropriate
near- or long-term fix for this problem. I'm worried time is short to get a
fix in for v3.15. I'm not sure how elegant that fix needs to be. I've gotten
good test runs using a modified/simplified version of Victor's arch callback
and a slight variation of Russell's sequence of operations from above:

void arch_uprobe_copy_ixol(struct page *page, unsigned long vaddr,
			   const void *src, int len)
{
	void *kaddr = kmap_atomic(page);

#ifdef CONFIG_SMP
	preempt_disable();
#endif
	memcpy(kaddr + (vaddr & ~PAGE_MASK), src, len);
	clean_dcache_area(kaddr, len);
	flush_cache_user_range(vaddr, vaddr + len);
#ifdef CONFIG_SMP
	preempt_enable();
#endif
	kunmap_atomic(kaddr);
}

I have to say using clean_dcache_area() to write back the two words that changed
(and the rest of the cache line, of course) seems more appropriate than flushing
a whole page. Are there implications in doing that which make this a bad idea,
though?
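
To put rough numbers on that comparison, here is a small user-space sketch (not
kernel code) of the arithmetic involved. The 4096-byte page and 64-byte cache
line are assumptions for illustration and vary between cores; the SKETCH_*
macros and both helper names are hypothetical and do not exist in the kernel.

```c
#include <stddef.h>

/* Assumed sizes, for illustration only; real hardware may differ. */
#define SKETCH_PAGE_SIZE	4096UL
#define SKETCH_CACHE_LINE	64UL

/* Offset of a user address within its page, i.e. vaddr & ~PAGE_MASK. */
static unsigned long sketch_page_offset(unsigned long vaddr)
{
	return vaddr & (SKETCH_PAGE_SIZE - 1);
}

/*
 * Total size, in bytes, of the cache lines touched by a write of
 * len bytes starting at page offset off.
 */
static unsigned long sketch_line_bytes(unsigned long off, unsigned long len)
{
	unsigned long first, last;

	if (len == 0)
		return 0;

	/* Round the first and last written byte down to a line boundary. */
	first = off & ~(SKETCH_CACHE_LINE - 1);
	last = (off + len - 1) & ~(SKETCH_CACHE_LINE - 1);

	return last - first + SKETCH_CACHE_LINE;
}
```

With these assumed sizes, an 8-byte write at page offset 0x238 touches a single
64-byte line, against 4096 bytes for a whole-page flush; the same write at
offset 0x3c straddles a line boundary and touches two lines (128 bytes).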

At any rate, for v3.15 do we want to pursue the more complex solutions with
"congruent" mappings and use of copy_to_user(), or just something like the above
(plus the rest of Victor's v3 patch)? I'm sure Oleg is even less happy than I am
about yet another arch_ callback, but we can hold out the hope that a more
elegant solution can follow in the next release, one that might introduce risk
we can't accept in v3.15 right now (e.g., mapping the xol area writeable for all
architectures).

I have also tested (somewhat) both Victor's unmodified v3 and v4 patches on
exynos 5250 and found them to work.

Thanks,
-dl