[PATCH] ath10k: fix system hang at qca99x0 probe on x86 platform
greearb at candelatech.com
Wed Jul 20 09:52:30 PDT 2016
On 07/19/2016 10:36 PM, Michal Kazior wrote:
> On 19 July 2016 at 17:25, Manoharan, Rajkumar <rmanohar at qti.qualcomm.com> wrote:
>> On June 30, 2016 12:39 PM, Michal Kazior <michal.kazior at tieto.com> wrote:
>>> On 29 June 2016 at 18:35, Manoharan, Rajkumar <rmanohar at qti.qualcomm.com> wrote:
>>>>>> On 29.06.2016 at 16:04, Sebastian Gottschall wrote:
>>>>>> This fix will crash QCA9980 on systems based on the QCA IPQ8064 CPU,
>>>>>> so please rework it or leave it out.
>>>>>> Maybe the limit of 256 KB is too low for that card.
>>>>> By the way, 512 works.
>>> I think this suggests the problem isn't about memory chunk size limit
>>> per se but some kind of bug in address/offset logic in fw or hw.
>>> DMA coherent and single-map addresses use completely different ranges
>>> in many cases. Perhaps some MSBs are not properly handled in fw or hw.
>>> I recall there is a magic macro through which target device accesses
>>> host memory so maybe that's a good place to look to better understand
>>> the problem?
>> Could you please shed some light on this issue? It seems this issue is popping up
>> more frequently and there are multiple threads for this issue.
>> "Anyone brought up 9984 NIC on x86-64?"
>> "AR9882 IOMMU faults"
> I think IOMMU faults were solved by using DMA_BIDIRECTIONAL, no?
Yes, that resolves the faults, or at least the vast majority of them.
The remaining spurious faults are likely firmware bugs accessing bad memory, I guess.
You probably don't notice this at all on ARM and other systems without a hardware IOMMU?
>> Even with the current logic, if the memory chunk allocation fails for a bigger size, it tries
>> to allocate smaller chunks. So if smaller chunks cause unexpected behaviour, that already
>> applies to the existing logic, no?
> We still don't know *why* using non-coherent memory causes problems.
> Changing chunk size limit seems to alter the behavior in some
> unpredictable ways, yes, but it's really hard to tell if the "try
> smaller chunk sizes" *itself* introduces any problems.
If it is crashing the firmware somehow and you can get a backtrace, then it can
likely be debugged. In my case, for instance, changing the size caused the firmware
to crash due to a lame logic bug in the firmware. Possibly other crashes are as
mundane as that.
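
For anyone following the thread, the "try a big chunk, then fall back to
smaller ones" behaviour discussed above can be sketched roughly as below.
This is a userspace illustration only, not the driver's actual code: the
names try_alloc, plan_chunks and struct chunk_plan are hypothetical, and
malloc() with an artificial size limit stands in for a DMA allocation that
may fail for large requests.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical result type: the chunk size that worked and how many are needed. */
struct chunk_plan {
    size_t chunk_size;
    size_t num_chunks;
};

/* Stand-in allocator (assumption): fails for any request larger than `limit`. */
static void *try_alloc(size_t size, size_t limit)
{
    return size <= limit ? malloc(size) : NULL;
}

/*
 * Start at start_chunk and halve the chunk size after each failed
 * allocation, giving up once the size drops below min_chunk.
 * Returns 0 and fills *out on success, -1 if no size could be allocated.
 */
static int plan_chunks(size_t total, size_t start_chunk, size_t min_chunk,
                       size_t limit, struct chunk_plan *out)
{
    size_t chunk = start_chunk;

    assert(out != NULL && min_chunk > 0);

    while (chunk >= min_chunk) {
        void *probe = try_alloc(chunk, limit);

        if (probe != NULL) {
            free(probe);
            out->chunk_size = chunk;
            /* Round up so the chunks cover the whole request. */
            out->num_chunks = (total + chunk - 1) / chunk;
            return 0;
        }
        chunk /= 2;
    }
    return -1;
}
```

The point of the sketch is that the chunk size the firmware ends up seeing
depends on which allocations happen to fail on a given host, so a firmware
bug tied to one particular size may show up on one platform and not on
another.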
Ben Greear <greearb at candelatech.com>
Candela Technologies Inc http://www.candelatech.com