[PATCH v2] nvme: rdma/tcp: call nvme_mpath_stop() from reconnect workqueue

Mon Apr 26 03:31:10 BST 2021

On 2021/4/25 19:34, Hannes Reinecke wrote:
> On 4/23/21 3:38 PM, mwilck at suse.com wrote:
>> From: Martin Wilck <mwilck at suse.com>
>>
>> We have observed a few crashes in run_timer_softirq(), where a broken
>> timer_list struct belonging to an anatt_timer was encountered. The broken
>> structures look like this, and we actually see multiple ones attached to
>> the same timer base:
>>
>> crash> struct timer_list 0xffff92471bcfdc90
>> struct timer_list {
>>    entry = {
>>      next = 0xdead000000000122,  // LIST_POISON2
>>      pprev = 0x0
>>    },
>>    expires = 4296022933,
>>    function = 0xffffffffc06de5e0 <nvme_anatt_timeout>,
>>    flags = 20
>> }
>>
>> If such a timer is encountered in run_timer_softirq(), the kernel
>> crashes. The test scenario was an I/O load test with lots of NVMe
>> controllers, some of which were removed and re-added on the storage side.
>>
> ...
> 
> But isn't this the result of detach_timer()? I.e. this looks suspiciously like perfectly normal operation; if you look at expire_timers(), we're first calling 'detach_timer()' before calling the timer function, i.e. every crash in the timer function would have this signature.
> And, incidentally, so would any timer function which does not crash.
> 
> Sorry to kill your analysis ...
> 
> This doesn't mean that the patch isn't valid (in the sense that it resolves the issue), but we definitely will need to work on root cause analysis.
The sequence may be:
1. ana_work adds the timer;
2. error recovery occurs and, during reconnect, the timer is reinitialized
   and nvme_read_ana_log is called, which may add the timer again.
The same timer is then added twice, and the crash happens later.
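
The sequence above can be modelled in userspace. This is only a minimal
sketch, not kernel code: the hlist types, the *_model helpers and the two
"buckets" are simplified stand-ins I made up for illustration, and it assumes
the second add lands in a different timer bucket than the first. It shows that
after re-initializing a still-queued timer and adding it again, the old bucket
keeps pointing at the timer, and once the new bucket detaches it the old
bucket sees an entry with next = LIST_POISON2 and pprev = NULL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's hlist types; not the real headers. */
struct hlist_node { struct hlist_node *next, **pprev; };
struct hlist_head { struct hlist_node *first; };

#define LIST_POISON2 ((struct hlist_node *)(uintptr_t)0xdead000000000122ULL)

/* Hypothetical, heavily reduced model of struct timer_list. */
struct timer_model { struct hlist_node entry; };

/* Like timer setup: forgets any list membership without unlinking. */
static void timer_init_model(struct timer_model *t)
{
	t->entry.next = NULL;
	t->entry.pprev = NULL;
}

/* A timer counts as pending while it is hashed into a bucket. */
static bool timer_pending_model(const struct timer_model *t)
{
	return t->entry.pprev != NULL;
}

/* Add the timer at the head of a bucket. */
static void enqueue_model(struct timer_model *t, struct hlist_head *h)
{
	struct hlist_node *n = &t->entry;

	n->next = h->first;
	if (h->first)
		h->first->pprev = &n->next;
	h->first = n;
	n->pprev = &h->first;
}

/* Unlink from the bucket the entry currently points at, then poison it. */
static void detach_model(struct timer_model *t)
{
	struct hlist_node *n = &t->entry;

	*n->pprev = n->next;
	if (n->next)
		n->next->pprev = n->pprev;
	n->pprev = NULL;
	n->next = LIST_POISON2;
}

/* Returns 1 if the stale bucket ends up referencing a poisoned entry. */
int simulate_double_add(void)
{
	struct hlist_head bucket_a = { NULL }, bucket_b = { NULL };
	struct timer_model t;

	timer_init_model(&t);
	enqueue_model(&t, &bucket_a);	/* 1. ana_work adds the timer */
	assert(timer_pending_model(&t));

	timer_init_model(&t);		/* 2. reconnect reinitializes it */
	enqueue_model(&t, &bucket_b);	/*    and the timer is added again */

	detach_model(&t);		/* bucket_b expires first */

	/* bucket_a was never updated and still links to the timer. */
	return bucket_a.first == &t.entry &&
	       t.entry.next == LIST_POISON2 &&
	       t.entry.pprev == NULL;
}
```

In this model, walking bucket_a afterwards and detaching the entry would
dereference the NULL pprev.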

Indeed, ana_log_buf has a similar bug, which we have encountered in our testing.
To fix it, I made the same patch and have tested it for more than 2 weeks.

This patch fixes both bugs.
> 
> Cheers,
> 
> Hannes