[PATCH] ath10k: double check bmi xfer pointers

Fri Apr 11 00:58:54 PDT 2014

On 11 April 2014 07:47, Michal Kazior <michal.kazior at tieto.com> wrote:
> On 11 April 2014 07:40, Kalle Valo <kvalo at qca.qualcomm.com> wrote:
>> Michal Kazior <michal.kazior at tieto.com> writes:
>>
>>> If for some reason copy engine ring buffer became
>>> corrupt ath10k could crash the machine due to
>>> invalid pointer dereference. It's very unlikely
>>> but devices can never be fully trusted so verify
>>> if the bmi xfer pointer read back from copy engine
>>> matches the original pointer.
>>
>> The big question is why does this happen? Does this happen only with
>> Ben's firmware or is it a more generic problem?
>
> I'll look more into this.

Hmm.. After going through the code a few times I think the bug is
actually something else.

If the crash happened in complete() it means the completion structure
wasn't set up properly. However it is always initialized in
ath10k_pci_hif_exchange_bmi_msg() before proceeding. This means xfer
pointer read back from ath10k_ce_completed_send_next() or
ath10k_ce_completed_recv_next() is wrong. Since the pointer to it is
kept on host getting wrong xfer means sw_index must be wrong. If I
assume indexes are managed correctly (i.e. no overflows, locking is
fine) then it is the entry the sw_index points to that is actually
unexpected.

This could happen if concurrent transfers occur on pipe 0 or 1 (used
for bmi communication) during device bootup. This could be either a
concurrent bmi transfer or a non-bmi buffer or an old bmi xfer data.
The latter shouldn't be the case because
ath10k_pci_hif_exchange_bmi_msg() cleans up after itself. Concurrent
access doesn't seem to be the case either.

I conclude the bug is not present in the vanilla ath10k code.

TL;DR drop the patch, please

Michał