[PATCH v12 10/26] nvme-tcp: Deal with netdevice DOWN events

Sun Aug 20 03:50:47 PDT 2023

>>>>> +     switch (event) {
>>>>> +     case NETDEV_GOING_DOWN:
>>>>> +             mutex_lock(&nvme_tcp_ctrl_mutex);
>>>>> +             list_for_each_entry(ctrl, &nvme_tcp_ctrl_list, list) {
>>>>> +                     if (ndev == ctrl->offloading_netdev)
>>>>> +                             nvme_tcp_error_recovery(&ctrl->ctrl);
>>>>> +             }
>>>>> +             mutex_unlock(&nvme_tcp_ctrl_mutex);
>>>>> +             flush_workqueue(nvme_reset_wq);
>>>>
>>>> In what context is this called? Every time we flush a workqueue,
>>>> lockdep finds another reason to complain about something...
>>>
>>> Thanks for highlighting this. We re-checked it and found that we are
>>> covered by nvme_tcp_error_recovery(), so we can remove the
>>> flush_workqueue() call above.
>>
>> Don't you need to flush at least err_work? How do you know that it
>> completed and put all the references?
> 
> Our bad, we do need to wait for the netdev reference to be put, and we
> must keep the flush_workqueue().
> 
> We did test with lockdep but did not notice any warnings.

I'm assuming you are running with lockdep and friends?

> 
> As for the context: when you set the link down, the event handler runs
> in the context of the process issuing the netlink syscall.
> 
> So if you run "ip link set X down" it would be (simplified):
> 
> "ip" -> syscall -> netlink api -> ... -> do_setlink -> call_netdevice_notifiers_info.

OK, that should be fine, I think.
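
To make the reference-counting concern above concrete, here is a rough
userspace model of why the handler has to wait for the error work before
it returns. This is only a sketch with pthreads, not kernel code: every
name in it (fake_netdev, netdev_get, netdev_put, netdev_refs, err_work_fn,
handle_going_down) is hypothetical, pthread_create() stands in for queueing
err_work, and pthread_join() stands in for flush_workqueue().

```c
/* Userspace analogy only; all names here are hypothetical. */
#include <assert.h>
#include <pthread.h>

/* Stand-in for a net_device that is pinned by a reference count. */
struct fake_netdev {
	pthread_mutex_t lock;
	int refcount;
};

static void netdev_get(struct fake_netdev *dev)
{
	pthread_mutex_lock(&dev->lock);
	dev->refcount++;
	pthread_mutex_unlock(&dev->lock);
}

static void netdev_put(struct fake_netdev *dev)
{
	pthread_mutex_lock(&dev->lock);
	assert(dev->refcount > 0);	/* catch an unbalanced put */
	dev->refcount--;
	pthread_mutex_unlock(&dev->lock);
}

static int netdev_refs(struct fake_netdev *dev)
{
	int refs;

	pthread_mutex_lock(&dev->lock);
	refs = dev->refcount;
	pthread_mutex_unlock(&dev->lock);
	return refs;
}

/* Models err_work: the teardown that finally drops the device reference. */
static void *err_work_fn(void *arg)
{
	struct fake_netdev *dev = arg;

	/* ... queue teardown would happen here ... */
	netdev_put(dev);
	return NULL;
}

/*
 * Models the NETDEV_GOING_DOWN handler. Returns the number of references
 * still held when the handler finishes, or -1 if the work could not start.
 */
static int handle_going_down(struct fake_netdev *dev)
{
	pthread_t worker;

	/* Analogue of kicking off error recovery: the work runs asynchronously. */
	if (pthread_create(&worker, NULL, err_work_fn, dev) != 0)
		return -1;

	/* Analogue of the flush: do not return until the work has completed. */
	pthread_join(worker, NULL);

	return netdev_refs(dev);
}
```

If the controller takes one reference with netdev_get() while offloading,
handle_going_down() returns 0 in this model. Without the join, the handler
could return while the worker still holds the reference, which is the
situation the flush in the patch is meant to rule out.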