[PATCH] Revert "IB/core: Fix use workqueue without WQ_MEM_RECLAIM"

Sagi Grimberg sagi at grimberg.me
Wed Jun 7 07:59:19 PDT 2023


>>>>>> workqueue: WQ_MEM_RECLAIM nvme-wq:nvme_rdma_reconnect_ctrl_work
>>>>>> [nvme_rdma] is flushing !WQ_MEM_RECLAIM ib_addr:process_one_req [ib_core]
>>>>>
>>>>> And why does nvme-wq need the WQ_MEM_RECLAIM flag? I wonder if it is
>>>>> really needed.
>>>>
>>>> Adding Sagi Grimberg to cc, he probably knows and can explain it better than me.
>>>
>>> We already allocate so much memory on these paths that it is pretty
>>> nonsensical to claim they are a reclaim context. One allocation on the WQ
>>> is not going to be the problem.
>>>
>>> Probably this nvme stuff should not be re-using a reclaim-marked WQ
>>> for memory-allocating work like this, it is kind of nonsensical.
>>
>> A controller reset runs on this workqueue, and it should succeed so that
>> pages can be flushed to the nvme disk. So I'd say it's kind of
>> essential that this sequence has a rescuer thread.
> 
> So don't run the CM stuff on the same WQ, go to another one without
> the reclaim mark?

That is not trivial. The teardown works won't need a rescuer, but they
will need to fence async work elements that are part of the bringup and
that do need a rescuer (for example, the scan work).

So it requires more thought on how to untangle the dependencies between
work elements that make use of a rescuer and those that don't, and
vice versa.
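
To make that concrete, here is a minimal, hypothetical sketch (the
workqueue names and work functions below are made up for illustration,
not the actual nvme-rdma code): a work item running on a WQ_MEM_RECLAIM
workqueue fences a work item queued on a plain workqueue, which is
exactly the pattern that makes check_flush_dependency() print the
warning quoted at the top.

/*
 * Hypothetical sketch: two workqueues, one marked WQ_MEM_RECLAIM for
 * the reset/teardown path and one unmarked for bringup-style work.
 * All names here are invented for illustration.
 */
#include <linux/init.h>
#include <linux/module.h>
#include <linux/workqueue.h>

static struct workqueue_struct *reset_wq;   /* WQ_MEM_RECLAIM: has a rescuer */
static struct workqueue_struct *async_wq;   /* no rescuer */

static struct work_struct scan_work;        /* bringup-style work */
static struct work_struct reset_work;       /* reclaim-path work */

static void scan_work_fn(struct work_struct *work)
{
        /* bringup work: may allocate memory, talk to the fabric, etc. */
}

static void reset_work_fn(struct work_struct *work)
{
        /*
         * Fencing bringup work from a WQ_MEM_RECLAIM worker is what
         * trips check_flush_dependency() and produces the splat quoted
         * above: a reclaim-marked worker must not wait on a
         * !WQ_MEM_RECLAIM work item.
         */
        flush_work(&scan_work);
}

static int __init wq_example_init(void)
{
        reset_wq = alloc_workqueue("example_reset_wq", WQ_MEM_RECLAIM, 0);
        async_wq = alloc_workqueue("example_async_wq", 0, 0);
        if (!reset_wq || !async_wq) {
                if (reset_wq)
                        destroy_workqueue(reset_wq);
                if (async_wq)
                        destroy_workqueue(async_wq);
                return -ENOMEM;
        }

        INIT_WORK(&scan_work, scan_work_fn);
        INIT_WORK(&reset_work, reset_work_fn);

        queue_work(async_wq, &scan_work);
        queue_work(reset_wq, &reset_work);
        return 0;
}

static void __exit wq_example_exit(void)
{
        flush_work(&reset_work);
        destroy_workqueue(async_wq);
        destroy_workqueue(reset_wq);
}

module_init(wq_example_init);
module_exit(wq_example_exit);
MODULE_LICENSE("GPL");

Moving the bringup-style works to a separate, unmarked workqueue only
helps if nothing on the reclaim path ever has to flush or cancel them,
which is the untangling problem described above.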


