[RESENDx2] [PULL] virtio: fix barriers for virtio-mmio
Ohad Ben-Cohen
ohad at wizery.com
Fri Dec 23 06:35:26 EST 2011
On Fri, Dec 23, 2011 at 11:35 AM, Linus Torvalds
<torvalds at linux-foundation.org> wrote:
> On x86, there really is never any reason to use the heavy memory
> barriers unless you are talking to a real device. And last I saw,
> "virtio" was still about virtual IO.
I reported this originally, so maybe I should describe our use case a
bit here (it's not virtio-mmio). It's a bit long, so apologizes in
advance.
Almost every SoC today have several additional cores (DSP or whatnot)
which usually employ some hardware multimedia accelerators and are
used to offload cpu-intensive tasks from the main application
processor. These other cores are used in an asymmetric multiprocessing
configurations, i.e. they run their own instance of operating system
(which can be some flavor of RTOS, or Linux, or whatnot. anything
goes).
Virtually every SoC vendor have this (lots of ARM vendors, but it's
definitely not limited to ARM), and they all have their own way of
controlling, and communicating with, those remote cores. And it's
usually rather big (tens of thousands loc), out-of-tree and very
vendor-specific code.
So we're trying to fix this by introducing some generic code that'd
control those remote cores and let drivers talk to them, which all
vendors could then use. I'll be sending you a 3.3 pull request for
this, but you can already take a look in linux-next at drivers/rpmsg
(inter-processor communication bus) and drivers/remoteproc (framework
for booting a remote core).
And rpmsg is using virtio to avoid implementing another shared memory
"wire" protocol. And of course to be able to reuse all the existing
virtio drivers (e.g. net, block, console) with a remote core backend.
Which leads me to the specific issue we have. On OMAP4, the virtio
kick is implemented using a memory-mapped mailbox device. After
updating a vring (which is mapped using ARM's Normal memory) and
before kicking the remote core (using the mailbox device which is
mapped using ARM's Device memory) we must use a "heavy" memory barrier
(i.e. ARM's DSB). Otherwise, if only an smp memory barrier is used
(i.e. DMB on ARM), the kick might jump ahead before the remote core
has observed the updates to the vrings. And then bad things happen.
We didn't want to inflict the performance degradation on the
virtualization use cases (which can run concurrently with the remote
core scenarios), hence the dynamic "IO or SMP barrier" thingy.
(Btw we don't have enough information about other
setups/configurations as other vendors just begin using virtio for
these kind of scenarios, but we guess this is probably isn't limited
to OMAP. Fixing it at the transport layer sounds reasonable, although
there are other ways to do this too).
Hope this makes sense,
Thanks,
Ohad.
More information about the linux-arm-kernel
mailing list