[RFC please help] membarrier: Rewrite sync_core_before_usermode()

Andy Lutomirski luto at kernel.org
Mon Dec 28 14:44:33 EST 2020


On Mon, Dec 28, 2020 at 11:09 AM Russell King - ARM Linux admin
<linux at armlinux.org.uk> wrote:
>
> On Mon, Dec 28, 2020 at 07:29:34PM +0100, Jann Horn wrote:
> > After chatting with rmk about this (but without claiming that any of
> > this is his opinion), based on the manpage, I think membarrier()
> > currently doesn't really claim to be synchronizing caches? It just
> > serializes cores. So arguably if userspace wants to use membarrier()
> > to synchronize code changes, userspace should first do the code
> > change, then flush icache as appropriate for the architecture, and
> > then do the membarrier() to ensure that the old code is unused?
> >
> > For 32-bit arm, rmk pointed out that that would be the cacheflush()
> > syscall. That might cause you to end up with two IPIs instead of one
> > in total, but we probably don't care _that_ much about extra IPIs on
> > 32-bit arm?
> >
> > For arm64, I believe userspace can flush icache across the entire
> > system with some instructions from userspace - "DC CVAU" followed by
> > "DSB ISH", or something like that, I think? (See e.g.
> > compat_arm_syscall(), the arm64 compat code that implements the 32-bit
> > arm cacheflush() syscall.)
>
> Note that the ARM cacheflush syscall calls flush_icache_user_range()
> over the range of addresses that userspace has passed - its intention
> since day one is to support cases where userspace wants to change
> executable code.
>
> It will issue the appropriate write-backs to the data cache (DCCMVAU),
> the invalidates to the instruction cache (ICIMVAU), invalidate the
> branch target buffer (BPIALLIS or BPIALL as appropriate), and issue
> the appropriate barriers (DSB ISHST, ISB).
>
> Note that neither flush_icache_user_range() nor flush_icache_range()
> result in IPIs; cache operations are broadcast across all CPUs (which
> is one of the minimums we require for SMP systems.)
>
> Now, that all said, I think the question that has to be asked is...
>
>         What is the basic purpose of membarrier?
>
> Is the purpose of it to provide memory barriers, or is it to provide
> memory coherence?
>
> If it's the former and not the latter, then cache flushes are out of
> scope, and expecting memory written to be visible to the instruction
> stream is totally out of scope of the membarrier interface, whether
> or not the writes happen on the same or a different CPU to the one
> executing the rewritten code.
>
> The documentation in the kernel does not seem to describe what it's
> supposed to be doing - the only thing I could find is this:
> Documentation/features/sched/membarrier-sync-core/arch-support.txt
> which describes it as "arch supports core serializing membarrier"
> whatever that means.
>
> Seems to be the standard and usual case of utterly poor to non-existent
> documentation within the kernel tree, or even a pointer to where any
> useful documentation can be found.
>
> Reading the membarrier(2) man page, I find nothing in there that talks
> about any kind of cache coherency for self-modifying code - it only
> seems to be about _barriers_ and nothing more, and barriers alone do
> precisely nothing to save you from non-coherent Harvard caches.
>
> So, either Andy has a misunderstanding, or the man page is wrong, or
> my rudimentary understanding of what membarrier is supposed to be
> doing is wrong...

Look at the latest man page:

https://man7.org/linux/man-pages/man2/membarrier.2.html

for MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE.  The result may not be
all that enlightening.
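
For concreteness, here is roughly what the userspace sequence Jann
describes would look like: write the code, do the architecture's cache
maintenance, then issue the SYNC_CORE membarrier. This is only a minimal
sketch under assumptions, not a statement of what the kernel guarantees.
The helper names (membarrier(), sync_core_register(), publish_code()) are
made up for illustration, and I'm assuming GCC/Clang's
__builtin___clear_cache() does the right per-arch cache maintenance (as
far as I know it ends up in cacheflush() on 32-bit arm and is a no-op on
x86).

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/membarrier.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical local wrapper: invoke the syscall directly via syscall(2). */
static int membarrier(int cmd, unsigned int flags)
{
	return (int)syscall(__NR_membarrier, cmd, flags, 0);
}

/*
 * Call once per process before the first SYNC_CORE barrier.
 * Returns 0 on success, -1 with errno set if the kernel or the
 * architecture does not offer the command.
 */
static int sync_core_register(void)
{
	int mask = membarrier(MEMBARRIER_CMD_QUERY, 0);

	if (mask < 0)
		return -1;
	if (!(mask & MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE)) {
		errno = EOPNOTSUPP;
		return -1;
	}
	return membarrier(MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED_SYNC_CORE, 0);
}

/*
 * Call after the new instructions have been written to [start, start + len).
 * Step 1: cache maintenance for the modified range (assumed to be handled
 *         by the compiler builtin for the target architecture).
 * Step 2: core-serialize every thread of this process.
 */
static int publish_code(void *start, size_t len)
{
	char *begin = start;

	__builtin___clear_cache(begin, begin + len);
	return membarrier(MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE, 0);
}
```

A JIT would call sync_core_register() at startup and publish_code() after
each patch, before letting any other thread jump to the new code. Whether
step 2 alone is enough, or step 1 is required, on a given architecture is
exactly the part the man page leaves unclear.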

--Andy


