[PATCH 0/1] RFC: memory coherency and data races on TX path

Daniel Mack daniel at zonque.org
Tue Apr 10 23:52:27 PDT 2018


Hi Ramon,

On Tuesday, April 10, 2018 08:11 PM, Ramon Fried wrote:
> On 10 April 2018 at 20:42, Daniel Mack <daniel at zonque.org> wrote:
>> I found something that I believe might be an issue, and I have an
>> idea on how to correct this, but unfortunately, this doesn't solve
>> the issues I occasionally see with this driver. I'd still like to
>> share it, because I might be totally mistaken in my understanding.
>
> Thanks for sharing you idea. Are you aware of the recent patch I sent
> that Loic Poulain
> wrote that fixes a race issue in access to wcn36xx_dxe_tx_frame()  ?
> Kalle just recently applied it to ath-next, I don't think it's
> available yet upstream.

Yes, my patch builds upon yours.

> The patch was fixing something similar, perhaps it will solve the
> issue you're experiencing.

I'm not even sure what kind of effect I'm hunting, so it's hard to tell.
Your patch definitely addresses a data race too though.

>> There's a problem with the latter presumption however which looks like
>> this in the driver code:
>>
>>         desc->fr_len = ctl->skb->len;
>>         /* set dxe descriptor to VALID */
>>         desc->ctrl = ch->ctrl_skb;
>
> The firmware won't start reading the descriptors until it receives an
> interrupt from driver.
> which in turn happen later in the function using: wcn36xx_dxe_write_register()
> so I don't think reordering in this case make any problem.

I understand, but as I said, there's definitely the problem that the
channel is already running when wcn36xx_dxe_tx_frame() is entered. Try
adding this at the beginning of the the function and then run iperf:

	int reg;
	wcn36xx_dxe_read_register(wcn, ch->reg_ctrl, &reg);
	WARN_ON(reg & WCN36xx_DXE_CH_CTRL_EN);

I fail to see how the firmware would determine which descriptors to look
at without any type of synchronization mechanism.

>> Does that make sense? As I said, I can't really say this improves
>> anything, sadly, so I might be mistaken entirely. But I'll leave this
>> here for further discussion. Ideally, somebody with access to the
>> firmware sources could give an assessment whether this is an issue at
>> all or not.
>
> When I'm not sure regarding some implementation I consult with the
> downstream pronto driver.
> https://github.com/sultanqasim/qcom_wlan_prima
> 
> Did you look there if they actually placed wmb() ?

Yes, I've read some of that as well. They use wmb() before writing to
and rmb() after reading firmware registers. The equivalent upstream is
wcn36xx_dxe_[read,write}_register(), and they use writel() and readl()
which have the same barriers on arm64. So that's fine.

What's interesting though is that the downstream drivers sets the VLD
bit of the first descriptor of the chain *after* they set all the
others. There are also some confusing comments about that (/* First
frame not set VAL bit, why ??? */). Can you make sense of that?


Thanks,
Daniel



More information about the wcn36xx mailing list