[PATCH v3 2/4] dmaengine: Add STM32 MDMA driver

Wed Aug 23 09:00:45 PDT 2017

On Tue, Aug 22, 2017 at 05:59:26PM +0200, Pierre Yves MORDRET wrote:
> 
> 
> On 08/16/2017 06:47 PM, Vinod Koul wrote:
> > On Wed, Jul 26, 2017 at 11:48:20AM +0200, Pierre-Yves MORDRET wrote:
> > 
> >> +/* MDMA Channel x transfer configuration register */
> >> +#define STM32_MDMA_CTCR(x)		(0x50 + 0x40 * (x))
> >> +#define STM32_MDMA_CTCR_BWM		BIT(31)
> >> +#define STM32_MDMA_CTCR_SWRM		BIT(30)
> >> +#define STM32_MDMA_CTCR_TRGM_MSK	GENMASK(29, 28)
> >> +#define STM32_MDMA_CTCR_TRGM(n)		(((n) & 0x3) << 28)
> >> +#define STM32_MDMA_CTCR_TRGM_GET(n)	(((n) & STM32_MDMA_CTCR_TRGM_MSK) >> 28)
> > 
> > OK this seems oft repeated here.
> > 
> > You are both extracting and setting bit-field values here, so why not do
> > this generically...
> > 
> > #define STM32_MDMA_SHIFT(n)		(ffs(n) - 1)
> > #define STM32_MDMA_SET(n, mask)		((n) << STM32_MDMA_SHIFT(mask))
> > #define STM32_MDMA_GET(n, mask)		(((n) & (mask)) >> STM32_MDMA_SHIFT(mask))
> > 
> > Basically, you derive the shift from the mask value with ffs(), so there is
> > no need to define these per field, and that reduces the chance of coding
> > errors...
> > 
> 
> OK.
> but I would prefer if you don't mind

hmmm, I don't have a very strong opinion, so okay. But from a programming PoV
it reduces human errors..

> #define STM32_MDMA_SET(n, mask)		(((n) << STM32_MDMA_SHIFT(mask)) & (mask))
> 
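To make the mask-derived helpers concrete, here is a minimal userspace
sketch, not the driver's code. The mdma_fld_shift(), MDMA_FLD_* and ctcr_*
names are hypothetical; only the TRGM mask value (bits 29:28) is taken from
the patch. It relies on ffs() from <strings.h>, which takes an int, so the
sketch assumes masks that do not include bit 31.

```c
#include <assert.h>
#include <stdint.h>
#include <strings.h>	/* ffs() */

/* Hypothetical helper: bit position of the lowest set bit in mask. */
static inline int mdma_fld_shift(uint32_t mask)
{
	/* ffs(0) returns 0, which would give a shift of -1 */
	assert(mask != 0 && mask <= 0x7fffffffu);
	return ffs((int)mask) - 1;
}

/* Place n into the field described by mask; excess bits are dropped. */
#define MDMA_FLD_SET(n, mask) \
	(((uint32_t)(n) << mdma_fld_shift(mask)) & (mask))

/* Extract the field described by mask from a register value. */
#define MDMA_FLD_GET(reg, mask) \
	(((uint32_t)(reg) & (mask)) >> mdma_fld_shift(mask))

/* Same value as GENMASK(29, 28) in the patch. */
#define CTCR_TRGM_MSK	0x30000000u

/* Read-modify-write of the TRGM field only. */
static uint32_t ctcr_set_trgm(uint32_t ctcr, uint32_t mode)
{
	return (ctcr & ~CTCR_TRGM_MSK) | MDMA_FLD_SET(mode, CTCR_TRGM_MSK);
}

static uint32_t ctcr_get_trgm(uint32_t ctcr)
{
	return MDMA_FLD_GET(ctcr, CTCR_TRGM_MSK);
}
```

With the trailing "& (mask)" an out-of-range value is truncated to the field
instead of spilling into neighbouring bits. The kernel's <linux/bitfield.h>
provides FIELD_PREP() and FIELD_GET() along the same lines for
compile-time-constant masks.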
> >> +static int stm32_mdma_get_width(struct stm32_mdma_chan *chan,
> >> +				enum dma_slave_buswidth width)
> >> +{
> >> +	switch (width) {
> >> +	case DMA_SLAVE_BUSWIDTH_1_BYTE:
> >> +	case DMA_SLAVE_BUSWIDTH_2_BYTES:
> >> +	case DMA_SLAVE_BUSWIDTH_4_BYTES:
> >> +	case DMA_SLAVE_BUSWIDTH_8_BYTES:
> >> +		return ffs(width) - 1;
> >> +	default:
> >> +		dev_err(chan2dev(chan), "Dma bus width not supported\n");
> > 
> > please log the width here, helps in debug...
> > 
> 
> Hum.. just a dev_dbg to log the actual width or within the dev_err ?

The latter please, log the width in the dev_err itself.
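
As a rough illustration of what is being asked for, here is a userspace
sketch that puts the rejected width into the error text. It is an
assumption-laden stand-in: width_to_log2() is a hypothetical name, the
message is formatted into a caller buffer instead of going through
dev_err(), and the -EINVAL return is assumed since the quoted hunk stops
before the return statement.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <strings.h>	/* ffs() */

/*
 * Hypothetical stand-in for stm32_mdma_get_width(): returns log2 of the
 * bus width in bytes, or -EINVAL (assumed) with the bad width reported.
 */
static int width_to_log2(unsigned int width, char *msg, size_t msg_len)
{
	assert(msg != NULL && msg_len > 0);
	msg[0] = '\0';

	switch (width) {
	case 1:
	case 2:
	case 4:
	case 8:
		return ffs((int)width) - 1;
	default:
		/* in the driver this would be the dev_err() call */
		snprintf(msg, msg_len, "Dma bus width %u not supported",
			 width);
		return -EINVAL;
	}
}
```

Seeing the actual value in the log tells you at once whether a client
passed an undefined width or one the hardware cannot do.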

> 
> >> +static u32 stm32_mdma_get_best_burst(u32 buf_len, u32 tlen, u32 max_burst,
> >> +				     enum dma_slave_buswidth width)
> >> +{
> >> +	u32 best_burst = max_burst;
> >> +	u32 burst_len = best_burst * width;
> >> +
> >> +	while ((burst_len > 0) && (tlen % burst_len)) {
> >> +		best_burst = best_burst >> 1;
> >> +		burst_len = best_burst * width;
> >> +	}
> >> +
> >> +	return (best_burst > 0) ? best_burst : 1;
> > 
> > When would best_burst be <= 0? Do we really need this check?
> > 
> > 
> 
> best_burst < 0 cannot happen (it is a u32), but best_burst == 0 is possible
> when no suitable burst is found. So we do need this check.
> 
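A small userspace copy of the loop shows when the zero case is reached. This
is a sketch for illustration: get_best_burst() is a hypothetical name, the
unused buf_len argument is dropped, and the assert on width is an added
guard that is not in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Same loop as the quoted stm32_mdma_get_best_burst(), minus buf_len. */
static uint32_t get_best_burst(uint32_t tlen, uint32_t max_burst,
			       uint32_t width)
{
	uint32_t best_burst = max_burst;
	uint32_t burst_len;

	/* added guard: a zero width makes every burst length zero */
	assert(width != 0);
	burst_len = best_burst * width;

	/* halve the burst until the burst length divides tlen */
	while (burst_len > 0 && (tlen % burst_len)) {
		best_burst >>= 1;
		burst_len = best_burst * width;
	}

	/*
	 * Example: tlen = 6, max_burst = 4, width = 4.
	 * burst_len goes 16, 8, 4, and 6 is a multiple of none of them,
	 * so best_burst ends at 0 and the fallback below returns 1.
	 */
	return best_burst > 0 ? best_burst : 1;
}
```

So the fallback triggers whenever tlen is not a multiple of the bus width,
since then not even a burst of one beat divides it.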
> >> +static struct dma_async_tx_descriptor *
> >> +stm32_mdma_prep_dma_cyclic(struct dma_chan *c, dma_addr_t buf_addr,
> >> +			   size_t buf_len, size_t period_len,
> >> +			   enum dma_transfer_direction direction,
> >> +			   unsigned long flags)
> >> +{
> >> +	struct stm32_mdma_chan *chan = to_stm32_mdma_chan(c);
> >> +	struct stm32_mdma_device *dmadev = stm32_mdma_get_dev(chan);
> >> +	struct dma_slave_config *dma_config = &chan->dma_config;
> >> +	struct stm32_mdma_desc *desc;
> >> +	dma_addr_t src_addr, dst_addr;
> >> +	u32 ccr, ctcr, ctbr, count;
> >> +	int i, ret;
> >> +
> >> +	if (!buf_len || !period_len || period_len > STM32_MDMA_MAX_BLOCK_LEN) {
> >> +		dev_err(chan2dev(chan), "Invalid buffer/period len\n");
> >> +		return NULL;
> >> +	}
> >> +
> >> +	if (buf_len % period_len) {
> >> +		dev_err(chan2dev(chan), "buf_len not multiple of period_len\n");
> >> +		return NULL;
> >> +	}
> >> +
> >> +	/*
> >> +	 * We allow to take more number of requests till DMA is
> >> +	 * not started. The driver will loop over all requests.
> >> +	 * Once DMA is started then new requests can be queued only after
> >> +	 * terminating the DMA.
> >> +	 */
> >> +	if (chan->busy) {
> >> +		dev_err(chan2dev(chan), "Request not allowed when dma busy\n");
> >> +		return NULL;
> >> +	}
> > 
> > is that a HW restriction? Once a txn is completed can't we submit
> > subsequent txn..? Can you explain this part please.
> > 
> 
> The driver can prepare any request: Slave SG, Memcpy or Cyclic. If the channel
> is busy completing a DMA transfer, the request is put on the pending list. Only
> when that transfer completes is the next descriptor processed and started.
> Cyclic is different, however: once a cyclic transfer is started, the channel
> stays busy until it is terminated. This is why we forbid any DMA preparation
> for this channel.
> Nonetheless, I believe we have a flaw here, since we have to forbid
> Slave/Memcpy/Cyclic while a cyclic request is ongoing.

But you are not submitting a txn to HW. The prepare_xxx operation prepares a
descriptor which is pushed to pending queue on submit and further pushed to
hw on queue move or issue_pending()

So here we should ideally accept the request.

After you finish a memcpy you can submit another memcpy, right...?
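
To spell out the flow described above, here is a toy userspace model of the
three steps. Every name in it (struct chan, struct desc, chan_prep() and so
on) is hypothetical and it is not how the driver or virt-dma is written; it
only shows that prepare and submit never need to look at the busy flag,
while issue_pending and completion are the only places that touch the
"hardware".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum desc_state { DESC_PREPARED, DESC_PENDING, DESC_ACTIVE, DESC_DONE };

struct desc {
	enum desc_state state;
	struct desc *next;
};

struct chan {
	bool busy;		/* hardware is running chan->active */
	struct desc *head;	/* pending queue, FIFO order */
	struct desc *tail;
	struct desc *active;
};

/* prep: only builds a descriptor; deliberately ignores chan->busy */
static void chan_prep(struct chan *c, struct desc *d)
{
	(void)c;
	d->state = DESC_PREPARED;
	d->next = NULL;
}

/* submit: append to the pending queue, still no hardware access */
static void chan_submit(struct chan *c, struct desc *d)
{
	assert(d->state == DESC_PREPARED);
	d->state = DESC_PENDING;
	if (c->tail)
		c->tail->next = d;
	else
		c->head = d;
	c->tail = d;
}

/* issue_pending: start the queue head, but only if the channel is idle */
static void chan_issue_pending(struct chan *c)
{
	struct desc *d = c->head;

	if (c->busy || !d)
		return;
	c->head = d->next;
	if (!c->head)
		c->tail = NULL;
	d->state = DESC_ACTIVE;
	c->active = d;
	c->busy = true;
}

/* completion: retire the active descriptor, then move the queue on */
static void chan_complete(struct chan *c)
{
	if (!c->active)
		return;
	c->active->state = DESC_DONE;
	c->active = NULL;
	c->busy = false;
	chan_issue_pending(c);
}
```

In this model a second descriptor prepared while the first is running simply
waits in the queue and is started from the completion path.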

> 
> 
> >> +	if (len <= STM32_MDMA_MAX_BLOCK_LEN) {
> >> +		cbndtr |= STM32_MDMA_CBNDTR_BNDT(len);
> >> +		if (len <= STM32_MDMA_MAX_BUF_LEN) {
> >> +			/* Setup a buffer transfer */
> >> +			tlen = len;
> >> +			ccr |= STM32_MDMA_CCR_TCIE | STM32_MDMA_CCR_CTCIE;
> >> +			ctcr |= STM32_MDMA_CTCR_TRGM(STM32_MDMA_BUFFER);
> >> +			ctcr |= STM32_MDMA_CTCR_TLEN((tlen - 1));
> >> +		} else {
> >> +			/* Setup a block transfer */
> >> +			tlen = STM32_MDMA_MAX_BUF_LEN;
> >> +			ccr |= STM32_MDMA_CCR_BTIE | STM32_MDMA_CCR_CTCIE;
> >> +			ctcr |= STM32_MDMA_CTCR_TRGM(STM32_MDMA_BLOCK);
> >> +			ctcr |= STM32_MDMA_CTCR_TLEN(tlen - 1);
> >> +		}
> >> +
> >> +		/* Set best burst size */
> >> +		max_width = DMA_SLAVE_BUSWIDTH_1_BYTE;
> > 
> > that may not be best.. we should use wider and longer bursts for best
> > throughput..
> > 
> 
> Will look at that.
> 
> >> +	ret = device_property_read_u32(&pdev->dev, "dma-requests",
> >> +				       &nr_requests);
> >> +	if (ret) {
> >> +		nr_requests = STM32_MDMA_MAX_REQUESTS;
> >> +		dev_warn(&pdev->dev, "MDMA defaulting on %i request lines\n",
> >> +			 nr_requests);
> >> +	}
> >> +
> >> +	count = of_property_count_u32_elems(of_node, "st,ahb-addr-masks");
> > 
> > Don't we have a device_property_xxx for this?
> 
> Sorry, no. Or at least I could not find one.

we do :) the array read with a NULL val argument returns the size of the array..

int device_property_read_u32_array(struct device *dev, const char *propname,
				   u32 *val, size_t nval)

Documentation says:
 Return: number of values if @val was %NULL,
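
The resulting count-then-read pattern can be sketched in userspace as below.
Note that mock_read_u32_array() is a made-up stand-in that only mimics the
contract quoted above (count when val is NULL, otherwise fill the array and
return 0); it is not the kernel function, and fake_masks[] and read_masks()
are hypothetical as well. The error codes in the mock are assumptions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical property contents standing in for "st,ahb-addr-masks". */
static const uint32_t fake_masks[] = { 0x20000000u, 0x00000000u };

/* Mock with the quoted contract: val == NULL means "just count". */
static int mock_read_u32_array(const uint32_t *prop, size_t prop_len,
			       uint32_t *val, size_t nval)
{
	if (!prop)
		return -EINVAL;		/* property absent (assumed code) */
	if (!val)
		return (int)prop_len;	/* count-only query */
	if (nval > prop_len)
		return -EOVERFLOW;	/* asked for more than exists */
	memcpy(val, prop, nval * sizeof(*val));
	return 0;
}

/* First call sizes the array, second call fills it. */
static uint32_t *read_masks(const uint32_t *prop, size_t prop_len,
			    size_t *count)
{
	uint32_t *masks;
	int n;

	assert(count != NULL);
	*count = 0;

	n = mock_read_u32_array(prop, prop_len, NULL, 0);
	if (n <= 0)
		return NULL;	/* absent or empty: treat as no masks */

	masks = calloc((size_t)n, sizeof(*masks));
	if (!masks)
		return NULL;

	if (mock_read_u32_array(prop, prop_len, masks, (size_t)n)) {
		free(masks);
		return NULL;
	}

	*count = (size_t)n;
	return masks;
}
```

In the probe code the same two calls would replace both
of_property_count_u32_elems() and of_property_read_u32_array().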

> 
> > 
> >> +	if (count < 0)
> >> +		count = 0;
> >> +
> >> +	dmadev = devm_kzalloc(&pdev->dev, sizeof(*dmadev) + sizeof(u32) * count,
> >> +			      GFP_KERNEL);
> >> +	if (!dmadev)
> >> +		return -ENOMEM;
> >> +
> >> +	dmadev->nr_channels = nr_channels;
> >> +	dmadev->nr_requests = nr_requests;
> >> +	of_property_read_u32_array(of_node, "st,ahb-addr-masks",
> >> +				   dmadev->ahb_addr_masks,
> >> +				   count);
> > 
> > I know we have a device API for array reads :)
> > and I think that helps in the former case..
> > 
> 
> Correct :) device_property_read_u32_array

yes..

-- 
~Vinod