Bug 119151 - [regression] ath10k no longer authenticates and freezes system

Ben Greear greearb at candelatech.com
Thu Jun 2 11:02:47 PDT 2016

On 06/02/2016 10:41 AM, Rajkumar Manoharan wrote:
> On 2016-06-02 22:53, Ben Greear wrote:
>> On 06/02/2016 10:03 AM, Manoharan, Rajkumar wrote:
>>> On Thursday, June 2, 2016 8:51 PM, Ben Greear <greearb at candelatech.com> wrote:
>>>> On 06/02/2016 07:24 AM, Valo, Kalle wrote:
>>>>> Kalle Valo <kvalo at qca.qualcomm.com> writes:
>>>>>> there's a regression in ath10k:
>>>>>> https://bugzilla.kernel.org/show_bug.cgi?id=119151
>>>>>> Reporter bisected it to this:
>>>>>> 5c86d97bcc1d42ce7f75685a61be4dad34ee8183 is the first bad commit
>>>>>> commit 5c86d97bcc1d42ce7f75685a61be4dad34ee8183
>>>>>> Author: Rajkumar Manoharan <rmanohar at qti.qualcomm.com>
>>>>>> Date:   Tue Mar 22 17:22:19 2016 +0530
>>>>>> ath10k: combine txrx and replenish task
> [...]
>>>> I found a lot of problems with this code as well, and the 5 patches
>>>> starting from the URL below fixed the issues for me.
>>> Ben,
>>> Can you please explain the sort of issues you have observed with this change?
>> I imported a bunch of upstream patches at once, so not sure exactly what commit
>> caused it.  And, this was about 2 months ago...  Upon review, I'm not sure I
>> even have the patch this particular bug was bisected to, so maybe that is
>> some other issue.
> Please keep track of buggy commit and report them asap.

I posted to the list at the time.  When I was debugging this, there
were so many conflicting issues that it was hard to find a single
regression point.

>> But, the problems I saw were deadlocks and memory corruption.  A lot of it was
>> because I was debugging new firmware at the time and so peer creation was
>> failing sometimes, and things like that.  The error handling in ath10k for
>> this was faulty and racy and such.  We have not seen any performance
>> regressions, but we mostly run on very powerful CPUs.
>> Please take a look at those 5 patches.  A good review would be much appreciated,
>> and by reading them you will better be able to see the problems I was hitting
>> and trying to fix.
> Below two patches are critical and I already shared my feedback.
> https://patchwork.kernel.org/patch/8727841/
> https://patchwork.kernel.org/patch/9073471/
> Others are LGTM.

Not sure what LGTM means.

This one fixes memory corruption:

This one fixes use-after-free memory bugs:

As does this one:

>> In case you want to look at the full context of those patches, you can find
>> them here (around 24 patches down from the top...)
> Quite a big list :)
>> http://dmz2.candelatech.com/?p=linux-4.4.dev.y/.git;a=summary
>> For now, I am sticking with 4.4 + what I pulled in, but will rebase
>> against upstream someday soon-ish and then we can start testing it all
>> over again :)
> Will go through the list. Better to post them to public if not.

Many of these patches are related to features only in my firmware.  The ~20-patch
patch-bomb was a start at adding some of the hopefully less controversial
support.  If I can ever get that upstream, then I will pick off another
set of patches and try to get them ready for upstream.


Ben Greear <greearb at candelatech.com>
Candela Technologies Inc  http://www.candelatech.com
