[PATCH 1/3] wifi: ath11k: fix dest ring-buffer corruption

Miaoqing Pan quic_miaoqing at quicinc.com
Tue Jun 3 22:32:08 PDT 2025



On 6/4/2025 10:34 AM, Miaoqing Pan wrote:
> 
> 
> On 6/3/2025 7:51 PM, Johan Hovold wrote:
>> On Tue, Jun 03, 2025 at 06:52:37PM +0800, Baochen Qiang wrote:
>>> On 6/2/2025 4:03 PM, Johan Hovold wrote:
>>
>>>> No, the barrier is needed between reading the head pointer and 
>>>> accessing
>>>> descriptor fields, that's what matters.
>>>>
>>>> You can still end up with reading stale descriptor data even when
>>>> ath11k_hal_srng_dst_get_next_entry() returns non-NULL due to 
>>>> speculation
>>>> (that's what happens on the X13s).
>>>
>>> The fact is that a dma_rmb() does not even prevent speculation, no 
>>> matter where it is
>>> placed, right?
>>
>> It prevents the speculated load from being used.
>>
>>> If so the whole point of dma_rmb() is to prevent from compiler 
>>> reordering
>>> or CPU reordering, but is it really possible?
>>>
>>> The sequence is
>>>
>>>     1# reading HP
>>>         srng->u.dst_ring.cached_hp = READ_ONCE(*srng- 
>>> >u.dst_ring.hp_addr);
>>>
>>>     2# validate HP
>>>         if (srng->u.dst_ring.tp == srng->u.dst_ring.cached_hp)
>>>             return NULL;
>>>
>>>     3# get desc
>>>         desc = srng->ring_base_vaddr + srng->u.dst_ring.tp;
>>>
>>>     4# accessing desc
>>>         ath11k_hal_desc_reo_parse_err(... desc, ...)
>>>
>>> Clearly each step depends on the results of previous steps. In this 
>>> case the compiler/CPU
>>> is expected to be smart enough to not do any reordering, isn't it?
>>
>> Steps 3 and 4 can be done speculatively before the load in step 1 is
>> complete as long as the result is discarded if it turns out not to be
>> needed.
>>
> 
> If the condition in step 2 is true and step 3 speculatively loads 
> descriptor from TP before step 1, could this cause issues?

Sorry for typo, if the condition in step 2 is false and step 3 
speculatively loads descriptor from TP before step 1, could this cause 
issues?

> 
> We previously had extensive discussions on this topic in the https:// 
> lore.kernel.org/linux-wireless/ecfe850c-b263-4bee-b888- 
> c34178e690fc at quicinc.com/ thread. On my platform, dma_rmb() did not work 
> as expected. The issue only disappeared after disabling PCIe endpoint 
> relaxed ordering in firmware side. So it seems that HP was updated 
> (Memory write) before descriptor (Memory write), which led to the problem.
> 
> 
> 
> 
> 




More information about the ath11k mailing list