[PATCH v8 2/3] dmaengine: amlogic: Add general DMA driver for A9

Thu May 21 02:00:04 PDT 2026

Thank you for your contribution! Sashiko AI review found 17 potential issues to consider:
- [Critical] The naive ring buffer allocator overflows on large scatterlists and silently overwrites active in-flight DMA descriptors.
- [High] The DMA engine device is unregistered twice on driver removal and probe error paths.
- [High] The DMA controller is exposed to clients before hardware and interrupts are fully initialized.
- [High] Integer truncation in the calculation of the allocation length for `aml_dma_dev` leads to a heap buffer overflow.
- [High] The driver allocates memory with `GFP_KERNEL` inside a DMA preparation callback, which can sleep in atomic context.
- [High] The driver unconditionally dereferences `cur_desc` in the interrupt handler and `terminate_all`, leading to a kernel panic if the channel is idle.
- [High] A single DMA transfer error permanently deadlocks the channel.
- [High] Calling `aml_dma_prep_slave_sg` with an empty scatterlist results in a buffer underflow write.
- [High] Mismatch between allocated links and written links for zero-length scatterlist elements.
- [High] Missing `tasklet_kill()` during driver removal leads to Use-After-Free.
- [High] For WCH (DEV_TO_MEM) channels, the hardware is never unpaused after `terminate_all()` or `pause()`.
- [High] The 64-bit physical address of the DMA descriptor list is truncated to 32 bits, causing DMA faults.
- [Medium] Incorrect and non-compliant descriptor residue calculation in `aml_dma_tx_status`.
- [Medium] Signed integer overflow due to 32-bit shift in the interrupt handler.
- [Medium] Concurrent modification of `aml_chan->status` without locking causes state corruption.
- [Medium] The shared interrupt handler unconditionally returns `IRQ_HANDLED` even when it processes no events.
- [Medium] Hardware descriptor fields use native endianness and unaligned 64-bit access.
--

commit 6328a62df7a031573dae5a113f1f4a5b08fd7d50
Author: Xianwei Zhao <xianwei.zhao at amlogic.com>
Subject: dmaengine: amlogic: Add general DMA driver for A9

This patch adds a new DMA engine driver for the Amlogic A9 general-purpose
DMA controller. It handles transfer operations using the virt-dma framework
and registers the controller using the dmaengine API.

> diff --git a/drivers/dma/amlogic-dma.c b/drivers/dma/amlogic-dma.c
> new file mode 100644
> index 0000000000000..7dae5ba15c7e7
> --- /dev/null
> +++ b/drivers/dma/amlogic-dma.c
> @@ -0,0 +1,682 @@
[ ... ]
> +struct aml_dma_sg_link {
> +#define LINK_LEN		GENMASK(26, 0)
> +#define LINK_IRQ		BIT(27)
> +#define LINK_EOC		BIT(28)
> +#define LINK_LOOP		BIT(29)
> +#define LINK_ERR		BIT(30)
> +#define LINK_OWNER		BIT(31)
> +	u32 ctl;
> +	u64 address;
> +	u32 revered;
> +} __packed;

[Severity: Medium]
Could this __packed layout cause problems on architectures that are sensitive
to unaligned memory accesses? Packing places the 64-bit address field at byte
offset 4, so it is not naturally aligned. The fields also use native-endian
types (u32, u64); would the hardware still read the descriptors correctly on
big-endian kernels?
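
To make the layout concern concrete, here is a user-space sketch that compares
the posted layout with one possible alternative. The struct and function names
are made up for illustration, and the split layout assumes the hardware reads
the address as two little-endian 32-bit words starting at offset 4, which the
patch does not state. In kernel code the fields would likely be __le32 and be
filled with cpu_to_le32().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Layout as posted: packing puts the 64-bit field at byte offset 4. */
struct link_packed {
	uint32_t ctl;
	uint64_t address;
	uint32_t reserved;
} __attribute__((packed));

/* Hypothetical alternative: two 32-bit halves, all naturally aligned. */
struct link_split {
	uint32_t ctl;
	uint32_t addr_lo;
	uint32_t addr_hi;
	uint32_t reserved;
};

static_assert(offsetof(struct link_packed, address) == 4, "u64 is misaligned");
static_assert(sizeof(struct link_split) == sizeof(struct link_packed),
	      "both layouts are 16 bytes");
static_assert(offsetof(struct link_split, addr_lo) ==
	      offsetof(struct link_packed, address),
	      "address starts at offset 4 in both");

/* Store a 64-bit bus address as two aligned 32-bit words, low word first. */
static void link_set_addr(struct link_split *l, uint64_t addr)
{
	l->addr_lo = (uint32_t)addr;
	l->addr_hi = (uint32_t)(addr >> 32);
}
```

Both layouts occupy the same 16 bytes, so the split form would not change the
descriptor size the hardware sees.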

[ ... ]
> +static enum dma_status aml_dma_tx_status(struct dma_chan *chan,
> +					 dma_cookie_t cookie,
> +					 struct dma_tx_state *txstate)
> +{
[ ... ]
> +	spin_lock_irqsave(&aml_chan->vchan.lock, flags);
> +	vd = vchan_find_desc(&aml_chan->vchan, cookie);
> +	if (vd) {
> +		list_for_each_entry(vd, &aml_chan->vchan.desc_issued, node) {
> +			aml_desc = to_aml_dma_desc(vd);
> +			residue += aml_desc->data_len;
> +			if (vd->tx.cookie == cookie)
> +				break;
> +		}

[Severity: Medium]
Does this incorrectly accumulate the data_len of independent descriptors? The
loop adds up every descriptor queued ahead of the target in desc_issued, rather
than reporting only the bytes remaining for the requested cookie. Furthermore,
list_for_each_entry() overwrites the loop cursor 'vd', discarding the
descriptor that vchan_find_desc() already located. Could the residue be taken
from that descriptor directly?
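
As a rough illustration of per-descriptor residue, the user-space sketch below
reports only the length of the descriptor that matches the cookie. The types
and the function name are hypothetical stand-ins, and it only covers
descriptors that have not started yet; an in-flight descriptor would also need
the hardware progress subtracted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for a queued descriptor; not the driver's real type. */
struct queued_desc {
	int cookie;
	uint32_t data_len;
};

/*
 * Report the residue of one not-yet-started descriptor: its own length,
 * independent of whatever else is queued ahead of it.
 */
static bool queued_residue(const struct queued_desc *issued, size_t n,
			   int cookie, uint32_t *residue)
{
	size_t i;

	assert(residue != NULL);
	for (i = 0; i < n; i++) {
		if (issued[i].cookie == cookie) {
			*residue = issued[i].data_len;
			return true;
		}
	}
	return false;	/* cookie is not in this list; *residue untouched */
}
```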

[ ... ]
> +static int find_dma_chan_link(struct aml_dma_chan *aml_chan, u32 num)
> +{
> +	int idx;
> +	unsigned long flags;
> +
> +	spin_lock_irqsave(&aml_chan->vchan.lock, flags);
> +	if ((aml_chan->idx_next + num) >= DMA_MAX_LINK)
> +		idx = 0;
> +	else
> +		idx = aml_chan->idx_next;
> +	aml_chan->idx_next = idx + num;
> +	spin_unlock_irqrestore(&aml_chan->vchan.lock, flags);
> +
> +	return idx;
> +}

[Severity: Critical]
Is it possible for 'num' to be larger than DMA_MAX_LINK (256)? If a scatterlist
needs more than 256 links, could idx_next become out of bounds and cause the
descriptor preparation loop to write past the end of the sg_link array? Could
this also silently overwrite active in-flight descriptors without checking if
they are currently being processed?
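
One way to address both points is to make the allocator fail instead of
wrapping blindly. The following is a simplified user-space model, not the
driver's code: the struct, the 'used' counter and the 'span' bookkeeping are
assumptions, and it presumes descriptors complete in FIFO order.

```c
#include <assert.h>

/* Simplified model of a per-channel link ring; all names are hypothetical. */
struct link_ring {
	unsigned int size;	/* total slots, e.g. DMA_MAX_LINK */
	unsigned int head;	/* next slot to hand out */
	unsigned int used;	/* slots owned by in-flight descriptors */
};

/*
 * Reserve num contiguous slots. Returns the first index, or -1 when the
 * request cannot be satisfied. *span receives the slots consumed, including
 * any skipped at the end of the ring, so the caller can release them later.
 */
static int ring_alloc(struct link_ring *r, unsigned int num, unsigned int *span)
{
	unsigned int start = r->head, skipped = 0;

	if (num == 0 || num > r->size)
		return -1;
	if (start + num > r->size) {	/* does not fit before the end: wrap */
		skipped = r->size - start;
		start = 0;
	}
	if (skipped + num > r->size - r->used)
		return -1;		/* would overlap in-flight links */
	r->used += skipped + num;
	r->head = (start + num) % r->size;
	*span = skipped + num;
	return (int)start;
}

/* Release the oldest reservation once its descriptor has completed. */
static void ring_release(struct link_ring *r, unsigned int span)
{
	assert(span <= r->used);
	r->used -= span;
}
```

With this shape, a prep callback could return NULL when ring_alloc() fails, and
the completion path would call ring_release() with the span stored in the
descriptor.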

[ ... ]
> +static struct dma_async_tx_descriptor *aml_dma_prep_slave_sg
> +		(struct dma_chan *chan, struct scatterlist *sgl,
> +		unsigned int sg_len, enum dma_transfer_direction direction,
> +		unsigned long flags, void *context)
> +{
[ ... ]
> +	aml_desc = kzalloc(sizeof(*aml_desc), GFP_KERNEL);

[Severity: High]
Can this allocation sleep in atomic context? Dmaengine prep callbacks are
expected to be callable from atomic contexts (e.g. inside interrupt handlers or
while holding spinlocks). Should GFP_NOWAIT or GFP_ATOMIC be used instead?

> +	if (!aml_desc)
> +		return NULL;
> +	link_count = sg_nents_for_dma(sgl, sg_len, SG_MAX_LEN);
> +	aml_desc->idx = find_dma_chan_link(aml_chan, link_count);
> +	sg_link = aml_chan->sg_link + aml_desc->idx;
> +	for_each_sg(sgl, sg, sg_len, i) {
> +		avail = sg_dma_len(sg);
> +		paddr = sg->dma_address;
> +		while (avail > SG_MAX_LEN) {
> +			/* set dma address and len to sglink*/
> +			sg_link->address = paddr;
> +			sg_link->ctl = FIELD_PREP(LINK_LEN, SG_MAX_LEN);
> +			paddr = paddr + SG_MAX_LEN;
> +			avail = avail - SG_MAX_LEN;
> +			sg_link++;
> +		}
> +		/* set dma address and len to sglink*/
> +		sg_link->address = paddr;
> +		sg_link->ctl = FIELD_PREP(LINK_LEN, avail);
> +
> +		aml_desc->data_len += sg_dma_len(sg);
> +		sg_link++;
> +	}

[Severity: High]
If a scatterlist element has a length of 0, sg_nents_for_dma() will not count
a link slot for it. Will this loop still write a link and increment sg_link for
that element, causing the driver to consume more link slots than it reserved
and corrupt the ring buffer?

> +
> +	/* the last sg set eoc flag */
> +	sg_link--;
> +	sg_link->ctl |= LINK_EOC;

[Severity: High]
If aml_dma_prep_slave_sg() is called with sg_len == 0, the for_each_sg loop
will not execute. Could this cause sg_link-- to point before the allocated
descriptor index, leading to an out-of-bounds underflow write?
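
The two questions above, about zero-length elements and an empty scatterlist,
come down to keeping the counting pass and the writing pass in agreement. A
user-space sketch of one way to do that follows. The types, the function names
and the SEG_MAX value are placeholders (the real SG_MAX_LEN is not shown in the
quoted hunk).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SEG_MAX 0x1000u	/* placeholder for SG_MAX_LEN */

struct seg { uint64_t addr; uint32_t len; };		/* mapped sg entry */
struct link { uint64_t addr; uint32_t len; int eoc; };	/* hardware link */

/* Links needed: ceil(len / SEG_MAX) per segment, so 0 for len == 0. */
static size_t count_links(const struct seg *segs, size_t n)
{
	size_t i, total = 0;

	for (i = 0; i < n; i++)
		total += segs[i].len / SEG_MAX + (segs[i].len % SEG_MAX != 0);
	return total;
}

/*
 * Write the links. Returns how many were written, or 0 when there is nothing
 * to transfer or the links do not fit in cap slots.
 */
static size_t fill_links(struct link *out, size_t cap,
			 const struct seg *segs, size_t n)
{
	size_t i, used = 0;

	if (count_links(segs, n) > cap)
		return 0;
	for (i = 0; i < n; i++) {
		uint64_t addr = segs[i].addr;
		uint32_t left = segs[i].len;

		while (left) {	/* zero-length segments write nothing */
			uint32_t chunk = left > SEG_MAX ? SEG_MAX : left;

			out[used].addr = addr;
			out[used].len = chunk;
			out[used].eoc = 0;
			used++;
			addr += chunk;
			left -= chunk;
		}
	}
	assert(used == count_links(segs, n));
	if (used)
		out[used - 1].eoc = 1;	/* only mark a link that exists */
	return used;
}
```

A caller that gets 0 back can reject the request before touching the ring, so
the end-of-chain flag is never written in front of the reserved range.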

[ ... ]
> +static int aml_dma_chan_pause(struct dma_chan *chan)
> +{
> +	struct aml_dma_chan *aml_chan = to_aml_dma_chan(chan);
> +	struct aml_dma_dev *aml_dma = aml_chan->aml_dma;
> +
> +	regmap_set_bits(aml_dma->regmap, aml_chan->reg_offs + RCH_CFG, CFG_PAUSE);
> +	aml_chan->pre_status = aml_chan->status;
> +	aml_chan->status = DMA_PAUSED;

[Severity: Medium]
Are these status updates subject to race conditions? aml_chan->status is
modified here without acquiring aml_chan->vchan.lock, while it is also modified
concurrently by the interrupt handler.

[ ... ]
> +static void aml_dma_start(struct aml_dma_chan *aml_chan)
> +{
[ ... ]
> +	if (aml_chan->direction == DMA_MEM_TO_DEV) {
> +		regmap_write(aml_dma->regmap, aml_chan->reg_offs + RCH_ADDR,
> +			     (aml_chan->sg_link_phys + aml_desc->idx * DMA_LINK_SIZE));

[Severity: High]
Does passing a 64-bit dma_addr_t (sg_link_phys) to regmap_write() silently
truncate the upper 32 bits of the address? If memory is allocated above 4GB,
will this lead to incorrect memory accesses by the DMA controller?
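
For illustration, the truncation can be shown with user-space equivalents of
the kernel's lower_32_bits() and upper_32_bits() helpers. The function names
below are made up. Whether the controller has a register for the upper address
bits is not visible in the quoted hunk; if it does not, limiting the device to
32-bit DMA addresses with dma_set_mask_and_coherent() may be the simpler
option.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* User-space equivalents of lower_32_bits() and upper_32_bits(). */
static uint32_t lo32(uint64_t v) { return (uint32_t)v; }
static uint32_t hi32(uint64_t v) { return (uint32_t)(v >> 32); }

/* True when addr can be written to one 32-bit register without loss. */
static bool fits_in_reg32(uint64_t addr)
{
	return hi32(addr) == 0;
}

/* Value for a 32-bit address register; refuses addresses above 4 GiB. */
static uint32_t reg32_value(uint64_t addr)
{
	assert(fits_in_reg32(addr));
	return lo32(addr);
}
```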

> +		regmap_write(aml_dma->regmap, aml_chan->reg_offs + RCH_LEN, aml_desc->data_len);
> +		regmap_clear_bits(aml_dma->regmap, RCH_INT_MASK, BIT(chan_id));
> +		/* for rch (tx) need set cfg 0 to trigger start */
> +		regmap_write(aml_dma->regmap, aml_chan->reg_offs + RCH_CFG, 0);
> +	} else if (aml_chan->direction == DMA_DEV_TO_MEM) {
> +		regmap_write(aml_dma->regmap, aml_chan->reg_offs + WCH_ADDR,
> +			     (aml_chan->sg_link_phys + aml_desc->idx * DMA_LINK_SIZE));
> +		regmap_write(aml_dma->regmap, aml_chan->reg_offs + WCH_LEN, aml_desc->data_len);
> +		regmap_clear_bits(aml_dma->regmap, WCH_INT_MASK, BIT(chan_id));
> +	}

[Severity: High]
For WCH (DEV_TO_MEM) channels, aml_dma_start() neglects to write to WCH_CFG.
If aml_dma_terminate_all() or aml_dma_chan_pause() previously set the CFG_PAUSE
bit, will WCH channels remain indefinitely paused?

> +}
[ ... ]
> +static irqreturn_t aml_dma_interrupt_handler(int irq, void *dev_id)
> +{
[ ... ]
> +	while (done) {
> +		i = ffs(done) - 1;
> +		regmap_write(aml_dma->regmap, CLEAR_RCH, BIT(i));
> +		done &= ~BIT(i);
> +		aml_chan = aml_dma->aml_rch[i];
> +		if (!aml_chan) {
> +			dev_err(aml_dma->dma_device.dev, "idx %d rch not initialized\n", i);
> +			continue;
> +		}
> +		aml_chan->status = (err & (1 << i)) ? DMA_ERROR : DMA_COMPLETE;

[Severity: Medium]
Could this shift result in a signed integer overflow? If 'i' reaches 31
(since MAX_CHAN_ID is 32), shifting the signed 32-bit literal '1' causes
undefined behavior. Should this use 1U << i or BIT(i) instead?
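
A small user-space sketch of the unsigned form, with a hypothetical CHAN_BIT()
macro standing in for the kernel's BIT():

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Unsigned single-bit mask; well defined for n in 0..31. */
#define CHAN_BIT(n) (UINT32_C(1) << (n))

/* True when the error status word flags the given channel. */
static bool chan_has_error(uint32_t err, unsigned int chan)
{
	assert(chan < 32);
	return (err & CHAN_BIT(chan)) != 0;
}
```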

[Severity: High]
If a hardware error occurs and aml_chan->status is set to DMA_ERROR, is there
any mechanism to reset the status back to DMA_COMPLETE? Without a reset path,
will aml_dma_start() simply return immediately for all future transfers,
permanently deadlocking the channel?

> +		spin_lock(&aml_chan->vchan.lock);
> +		aml_desc = aml_chan->cur_desc;
> +		if (aml_chan->status == DMA_ERROR) {
> +			aml_desc->vd.tx_result.result = DMA_TRANS_READ_FAILED;
> +			regmap_read(aml_dma->regmap, aml_chan->reg_offs + RCH_RD_LEN, &cpl_data);
> +			aml_desc->vd.tx_result.residue = aml_desc->data_len - cpl_data;
> +		}

[Severity: High]
If a spurious interrupt fires, or if it fires concurrently with
aml_dma_terminate_all() (which sets cur_desc to NULL), will aml_desc be NULL
here? This appears to unconditionally dereference aml_desc, potentially causing
a NULL pointer dereference.
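
A guard along the following lines would avoid the dereference. This is a
user-space model with stand-in types, not the driver's structures, and it
assumes an idle channel is represented by cur_desc being NULL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct desc { uint32_t data_len; uint32_t residue; bool failed; };
struct chan { struct desc *cur_desc; };

/* Returns true if a descriptor was completed, false if the channel was idle. */
static bool chan_complete(struct chan *c, bool error, uint32_t done_bytes)
{
	struct desc *d;

	assert(c != NULL);
	d = c->cur_desc;
	if (!d)
		return false;	/* idle or already terminated: nothing to do */
	if (error) {
		d->failed = true;
		/* Clamp so a bogus hardware count cannot underflow. */
		d->residue = done_bytes < d->data_len ?
			     d->data_len - done_bytes : 0;
	}
	c->cur_desc = NULL;
	return true;
}
```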

[ ... ]
> +	}
> +
> +	return IRQ_HANDLED;
> +}

[Severity: Medium]
Because this driver requests a shared interrupt line (IRQF_SHARED), shouldn't
the handler return IRQ_NONE if it processed no events (e.g., done and end
were both 0)? Returning IRQ_HANDLED unconditionally could break shared interrupt
semantics and cause issues with the kernel's spurious interrupt detection.

[ ... ]
> +static int aml_dma_probe(struct platform_device *pdev)
> +{
> +	struct device_node *np = pdev->dev.of_node;
> +	struct dma_device *dma_dev;
> +	struct aml_dma_dev *aml_dma;
> +	int ret, i, len;
> +	u32 chan_nr;
[ ... ]
> +	ret = of_property_read_u32(np, "dma-channels", &chan_nr);
> +	if (ret)
> +		return dev_err_probe(&pdev->dev, ret, "failed to read dma-channels\n");
> +
> +	len = sizeof(struct aml_dma_dev) + sizeof(struct aml_dma_chan) * chan_nr;
> +	aml_dma = devm_kzalloc(&pdev->dev, len, GFP_KERNEL);

[Severity: High]
Is the calculation of 'len' susceptible to an integer overflow? 'chan_nr' is
read directly from the device tree and isn't bounds-checked before being used
in the calculation of the signed integer 'len'. Could a large 'chan_nr' cause
a heap buffer overflow later during initialization?
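
A bounds-checked size computation might look like the user-space sketch below.
The function name and the MAX_CHANS limit are assumptions (the limit mirrors
the MAX_CHAN_ID value of 32 mentioned earlier). In the driver itself, the
kernel's struct_size() helper serves a similar purpose for a trailing array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_CHANS 32u	/* assumed upper bound on dma-channels */

/*
 * Compute base + elem * count into *out. Returns false when count is out of
 * range or the result would not fit in size_t.
 */
static bool alloc_len(size_t base, size_t elem, uint32_t count, size_t *out)
{
	assert(out != NULL);
	if (count == 0 || count > MAX_CHANS)
		return false;
	if (elem != 0 && count > (SIZE_MAX - base) / elem)
		return false;
	*out = base + elem * count;
	return true;
}
```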

[ ... ]
> +	ret = dmaenginem_async_device_register(dma_dev);
> +	if (ret)
> +		return dev_err_probe(&pdev->dev, ret, "failed to register dmaenginem\n");
> +
> +	ret = of_dma_controller_register(np, aml_of_dma_xlate, aml_dma);
> +	if (ret)
> +		goto err_dmaenginem_register;
> +
> +	regmap_write(aml_dma->regmap, RCH_INT_MASK, 0xffffffff);
> +	regmap_write(aml_dma->regmap, WCH_INT_MASK, 0xffffffff);
> +
> +	ret = devm_request_irq(&pdev->dev, aml_dma->irq, aml_dma_interrupt_handler,
> +			       IRQF_SHARED, dev_name(&pdev->dev), aml_dma);

[Severity: High]
Does exposing the DMA controller to clients before the interrupt is requested
leave a window where a client could queue a transfer and the hardware could
generate an unhandled interrupt? Should the interrupt be initialized before
dmaenginem_async_device_register() and of_dma_controller_register() are called?

> +	if (ret) {
> +		dev_err(&pdev->dev, "failed to request_irq\n");
> +		goto err_of_dma_controller;
> +	}
> +
> +	platform_set_drvdata(pdev, aml_dma);
> +
> +	return 0;
> +
> +err_of_dma_controller:
> +	of_dma_controller_free(np);
> +err_dmaenginem_register:
> +	dma_async_device_unregister(dma_dev);

[Severity: High]
Because dmaenginem_async_device_register() already registers a devres action
that unregisters the device when probe fails or the driver is unbound, does
calling dma_async_device_unregister() here manually result in a double-free or
list corruption?

> +
> +	return ret;
> +}
> +
> +static void aml_dma_remove(struct platform_device *pdev)
> +{
> +	struct aml_dma_dev *aml_dma = platform_get_drvdata(pdev);
> +
> +	of_dma_controller_free((&pdev->dev)->of_node);
> +	dma_async_device_unregister(&aml_dma->dma_device);

[Severity: High]
Similarly, does calling dma_async_device_unregister() here cause a double-free
since dmaenginem_async_device_register() was used?

[Severity: High]
Is a call to tasklet_kill() needed here during removal? Since the virt-dma
framework schedules tasklets for DMA completions, failing to kill them could
lead to a use-after-free if a tasklet executes after the driver has been unloaded.

> +}

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260521-amlogic-dma-v8-0-86cc2ce94142@amlogic.com?part=2