[RFC] Describing arbitrary bus mastering relationships in DT

Fri May 9 03:56:38 PDT 2014

On Fri, May 02, 2014 at 09:06:43PM +0200, Arnd Bergmann wrote:
> On Friday 02 May 2014 12:50:17 Stephen Warren wrote:
> > On 05/02/2014 07:23 AM, Thierry Reding wrote:
> > > On Fri, May 02, 2014 at 02:32:08PM +0200, Arnd Bergmann wrote:
> > >> On Friday 02 May 2014 13:05:58 Thierry Reding wrote:
> > >>>
> > >>> Let me see if I understood the above proposal by trying to translate it
> > >>> into a simple example for a specific use-case. On Tegra for example we
> > >>> have various units that can either access system memory directly or use
> > >>> the IOMMU to translate accesses for them. One such unit would be the
> > >>> display controller that scans out a framebuffer from memory.
> > >>
> > >> Can you explain how the decision is made whether the IOMMU gets used
> > >> or not? In all cases I've seen so far, I think we can hardwire this
> > >> in DT, and only expose one or the other. Are both ways used
> > >> concurrently?
> > > 
> > > It should be possible to hardcode this in DT for Tegra. As I understand
> > > it, both interfaces can't be used at the same time. Once translation has
> > > been enabled for one client, all accesses generated by that client will
> > > be translated.
> > > 
> > > Hiroshi, please correct me if I'm wrong.
> > 
> > I believe the HW connectivity is always as follows:
> > 
> > Bus master (e.g. display controller) ---> IOMMU (Tegra SMMU) ---> RAM
> > 
> > In the IOMMU, there is a bit per bus master that indicates whether the
> > IOMMU translates the bus master's accesses or not. If that bit is
> > enabled, then page tables in the IOMMU are used to perform the translation.
> > 
> > You could also look at the HW setup as:
> > 
> > Bus master (e.g. display controller)
> >     v
> >    ----
> >   /    \
> >   ------
> >    |  \
> >    |   ------------------
> >    |                     \
> >    v                     v
> > IOMMU (Tegra SMMU) ---> RAM
> > 
> > But IIRC the bit that controls that demux is in the IOMMU, so this
> > distinction probably isn't relevant.
> 
> Ok. I think this case can be dealt with easily enough without
> having to represent it as two master ports on one device. There
> really is just one master, and it can be configured in two ways.

I think this is effectively a "bypass" control in the IOMMU, so it
can be treated as part of the IOMMU.
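
As a rough illustration of what "part of the IOMMU" could mean for a
driver, here is a minimal C sketch. It is not taken from the Tegra SMMU
documentation: the one-bit-per-master layout, the 32-master limit and the
names smmu_model, smmu_set_translate and smmu_translates are all
hypothetical. The point is only that the bus master itself needs no second
port; the bypass state lives entirely in the IOMMU.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: one translate-enable bit per bus master ID. */
struct smmu_model {
	uint32_t translate_enable;	/* bit n set => master n is translated */
};

/* Turn translation on (or back to bypass) for a single master. */
static void smmu_set_translate(struct smmu_model *smmu,
			       unsigned int master, bool on)
{
	if (master >= 32)
		return;		/* assumed limit of this sketch */

	if (on)
		smmu->translate_enable |= UINT32_C(1) << master;
	else
		smmu->translate_enable &= ~(UINT32_C(1) << master);
}

/* True if accesses from this master go through the page tables. */
static bool smmu_translates(const struct smmu_model *smmu,
			    unsigned int master)
{
	return master < 32 && ((smmu->translate_enable >> master) & 1u);
}
```

Changing one master's bit leaves every other master's setting untouched,
which is what lets the topology stay "master -> IOMMU -> RAM" in either
configuration.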

Whether any real system will require us to describe dynamic forking
is unclear.

By "dynamic forking" I mean where a transaction really does flow down
different paths or through different sets of components based on more
than just the destination address, possibly under runtime control.

> 
> We can either choose to make the DT representation decide which
> way is used, or we can always point to the IOMMU, and let the
> IOMMU driver decide.
> 
> > Now, perhaps there are devices which themselves control whether
> > transactions are sent to the IOMMU or direct to RAM, but I'm not
> > familiar with them. Is the GPU in that category, since it has its own
> > GMMU, albeit chained into the SMMU IIRC?
> 
> Devices with a built-in IOMMU such as most GPUs are also easy enough
> to handle: There is no reason to actually show the IOMMU in DT and
> we can just treat the GPU as a black box.

It's impossible for such a built-in IOMMU to be shared with other
devices, so that's probably reasonable.

For an IOMMU out in the interconnect, the OS needs to understand
that it is shared, so that it knows not to mess up other flows
when poking the IOMMU.
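
A minimal C sketch of the kind of bookkeeping this implies, assuming a
simple use count is enough. The names shared_iommu, iommu_attach and
iommu_detach are hypothetical and do not correspond to any existing
kernel API:

```c
#include <stdbool.h>

/* Hypothetical state for an IOMMU shared by several bus masters. */
struct shared_iommu {
	unsigned int attached;	/* number of masters currently using it */
	bool enabled;
};

/* The first master to attach enables the IOMMU. */
static void iommu_attach(struct shared_iommu *mmu)
{
	if (mmu->attached++ == 0)
		mmu->enabled = true;
}

/* Only the last master to detach may disable it. */
static void iommu_detach(struct shared_iommu *mmu)
{
	if (mmu->attached == 0)
		return;		/* unbalanced detach: ignore */

	if (--mmu->attached == 0)
		mmu->enabled = false;
}
```

Without knowing that the IOMMU is shared, a driver for one master could
disable it while another master's flows still depend on it.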

Cheers
---Dave

> 
> Note that you don't really have to support the dma-mapping.h API on
> GPUs, they usually need to go down to the IOMMU level anyway.