AER: Malformed TLP recovery deadlock with NVMe drives

okaya at codeaurora.org
Mon May 7 17:21:38 PDT 2018


On 2018-05-08 00:57, Alex_Gagniuc at Dellteam.com wrote:
> On 5/7/2018 5:46 PM, okaya at codeaurora.org wrote:
> [snip]
>>> If it were easy, somebody would have patched it by now ;)
>> 
>> Can you file a bugzilla, CC me, Keith and Bjorn, and attach all of
>> your logs?
> 
> Sure. Which bugzilla?
> 

https://bugzilla.kernel.org

Drivers -> PCI


> 
>> Let's debug this there.
> 

Bugzilla makes it easier to keep track of which log is for what.

In my experience, Bugzilla is preferred unless Keith or Bjorn has a
different opinion.

> Debugging over email not fun enough?
> 
> Alex
> 
> 
>>>> With this patch, you shouldn't see link down and up interrupts
>>>> during reset, but I do see them in the log.
>>> 
>>> You will see the messages from the link up/down events regardless of
>>> whether any action is actually taken.
>>> 
>>>> Can you also share a fail case log with this patch and a diff of your
>>>> hacks so that we know where the prints are coming from?
>>> 
>>> Of course. An example of a failing case is in [3]; it is identical to
>>> the fail log without any patches. Although the prints include the
>>> function name, the diff is in [4].
>>> 
>>> Alex
>>> 
>>> [3] http://gtech.myftp.org/~mrnuke/nvme_logs/log-20180507-1509.log
>>> [4] http://gtech.myftp.org/~mrnuke/nvme_logs/print_hacks.patch
>>> 
>>> 
>>>>> [2] http://gtech.myftp.org/~mrnuke/nvme_logs/log-20180507-1429.log
>>>>>>> [1] http://gtech.myftp.org/~mrnuke/nvme_logs/log-20180507-1308.log
>> 



More information about the Linux-nvme mailing list