[PATCH] dmaengine: xilinx_dma: Fix CPU stall in xilinx_dma_poll_timeout

Gupta, Suraj suraj.gupta2 at amd.com
Wed Apr 1 04:15:27 PDT 2026



On 4/1/2026 1:57 PM, Alex Bereza wrote:
> [You don't often get email from alex at bereza.email. Learn why this is important at https://aka.ms/LearnAboutSenderIdentification ]
> 
> Caution: This message originated from an External Source. Use proper caution when opening attachments, clicking links, or responding.
> 
> 
> On Wed Apr 1, 2026 at 7:23 AM CEST, Suraj Gupta wrote:
> 
>>> Rename XILINX_DMA_LOOP_COUNT to XILINX_DMA_POLL_TIMEOUT_US because the
>>> former is incorrect. It is a timeout value for polling various register
>>> bits in microseconds. It is not a loop count. Add a constant
>>> XILINX_DMA_POLL_DELAY_US for delay_us value.
>>
>> Please split this change in a new patch.
> 
> Ok, will send a v2.
> 
>>> Fixes: 7349a69cf312 ("iopoll: Do not use timekeeping in read_poll_timeout_atomic()")
>>
>> This patch doesn't fixes anything in iopoll, please use correct fixes tag.
> 
> Ok, but I'm not sure what would be the correct fixes tag then? I though I need to reference
> 7349a69cf312 in fixes tag because this is the actual change that surfaced the CPU stall issue that I
> want to fix in this driver. I'm fixing the call sites of xilinx_dma_poll_timeout but they were added
> in different commits. Should I add all of them? That would be the following then:
> 
> Fixes: 9495f2648287 ("dmaengine: xilinx_vdma: Use readl_poll_timeout instead of do while loop's")
> Fixes: 676f9c26c330 ("dmaengine: xilinx: fix device_terminate_all() callback for AXI CDMA")
> 
> Three call sites with delay_us=0 were first introduced by 9495f2648287, then 676f9c26c330 added the
> fourth call site when introducing xilinx_cdma_stop_transfer (probably copy paste from
> xilinx_dma_stop_transfer). Would adding these two fixes tags be correct?
> 
>>> Hi, in addition to this patch I also have a question: what is the point
>>> of atomically polling for the HALTED or IDLE bit in the stop_transfer
>>> functions? Does device_terminate_all really need to be callable from
>>> atomic context? If not, one could switch to polling non-atomically and
>>> avoid burning CPU cycles.
>>>
>>
>> dmaengine_terminate_async(), which directly calls device_terminate_all
>> can be called from atomic context.
> 
> Right, thanks! Just for my understanding: I still think there is potential for improvement, because
> from my understanding it would be beneficial to do the waiting for the bits in the status register
> and the freeing of descriptors in xilinx_dma_synchronize. Do I understand correctly that this is
> currently not possible due to how the DMA engine API is structured? To make this possible I think
> the deprecated dmaengine_terminate_all would have to be removed and all users of this API would have
> to be adapted accordingly, correct? So this would be a patch of much larger scope than xilinx_dma
> driver alone.

Yes, your understanding of xilinx_dma_synchronize()and proposed changes 
looks correct.

Regards,
Suraj



More information about the linux-arm-kernel mailing list