ARM: bcm2835: DMA driver + spi_optimize_message - some questions/info

Mark Brown broonie at
Wed Apr 2 14:15:47 EDT 2014

On Wed, Apr 02, 2014 at 06:06:11PM +0200, Martin Sperl wrote:

Just a general meta comment: your mail is over 700 lines long, that's
way too long.  Please split things up into smaller chunks or take a
higher level view as something this long is just asking far too much of
the reader especially without work to structure the text.  Things like
splitting out the device specifics help a lot.

I've just skimmed through this for now, I'll try to find time to read in
more detail but there's a backlog of patches I need to get to.

> * right now I am not really using dma-engine, as it is too CPU intensive
>   setting a complex SPI message up (a typical read+write SPI message takes
>   typically 21 DMA transfers)
> * instead I started a new dma-fragment interface that has:

This needs to be resolved for mainlining, if dmaengine is not up to the
job we need to either decide to replace it wholesale or extend it (more
likely the latter) rather than just doing something custom.  I do see
the need to improve the DMA APIs here but that needs to be done in a way
that's joined up with other work on DMA.

> * there is some provision to see if an spi message contains is of a
>   single/double transfer type and in those cases take an "optimized"
>   spi_message (done by the driver) instead and fill it with the data
>   from the submitted spi_message and make use of all those "vary" cases.
>   this would reduce the DMA creation cost for those synchronous transfers
>   (spi_write_then_read, spi_write, spi_read), but this is not fully in place

I don't know what a "single/double transfer type" is.

> There is also one question here with regards to semantics:
> When do we need to call the "complete" method?
> When we have received all the data or when the CS has been de-asserted
> (after the delay_usec)?

Users might reasonably assume that the transfer has finished at the time
any callback runs and that has to include the /CS change otherwise the
device might not have reacted.

> So the questions are:
> Are there any major questions on this design?
> Does it look acceptable from a design perspective?
> Is it generic enough that it might get used with other devices as well?

I was having a bit of a wood from the trees problem following the design
here but broadly there does seem to be a reasonable amount of usefulness
and reusability here.  A lot of systems are going to struggle to use the
full feature set due to restrictions on things like minimum DMA transfer
sizes and not being able to DMA to control registers and like I say the
DMA API issues need to be addressed.  Trying to do chip selects with DMA
is also going to be a lot of fun in the general case, especially with

I also suspect that there is a cutoff point where PIO makes sense even
if we can DMA.

> Do we need all those "VARY" cases or would we need to vary by more fields?

The main things are probably replacing and changing the data.

> I still think that the optimize/unoptimize_message methods are quite similar
> to prepare/unprepare_message so the question is if these can not get merged
> into a single pair of methods.

Drivers are currently using prepare_message() for actual hardware setup
specific to the message that's done around actual transmission whereas
the optimisation can be done very distant to the actual use of the

> Also the question is how I should initially submit those "generic" components.
> Should I create a helper.h and helper.c with those components, so that they
> can get split out later or should I keep them separate from the start.

I'm not sure what those helper bits are but in general keeping things so
that they can be reused is best.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: Digital signature
URL: <>

More information about the linux-rpi-kernel mailing list