[LSF/MM/BPF TOPIC] Block-layer device resets
Hannes Reinecke
hare at suse.de
Mon Feb 2 15:04:16 PST 2026
On 2/2/26 02:46, Damien Le Moal wrote:
> On 2/2/26 02:06, Hannes Reinecke wrote:
>> Hi all,
>>
>> We are currently working on implementing cross-controller resets for
>> NVMe, which requires sending a command to the target, which should then
>> terminate all commands on a given controller.
>> While we could easily terminate the controller, the specification
>> also requires us to terminate all outstanding commands.
>> Which then recurses into my all-time favourite topic on how to
>> abort outstanding commands from the fs/bio layer.
>>
>> However, here we don't have to dissect/match to individual commands,
>> but rather have to abort everything, which seems rather easier.
>>
>> So I would like to fathom whether such a thing is feasible/reasonable
>> (I think so, obviously, and can think of several other use-cases, too,
>> qemu springs to mind here ...) and discuss possible implementations
>> (set 'req->deadline' to zero for all pending commands?).
>> Or maybe we can do such a thing already and I'm just not aware of it...
>
> Hmmm... Command timeouts ? E.g. if a controller is slow to respond (send
> completions), the block layer timeout timer may trigger, which will call into
> the low level device driver to force a reset. But before the reset actually
> happens, completions may actually come back, and we do handle that race
> correctly, well at least for scsi/ata.
>
> Your scenario sounds very similar to this: once you reset the controller,
> whatever was pending will be silent and can be aborted or retried. So it does
> sound like that should not be too difficult, no ? Generalize the timeout
> processing or do something similar ?
>
The good thing is we don't even need to generalize anything. It should
be sufficient to walk the inflight requests and set
'rq->deadline' to 'jiffies'. The general idea is just to _initiate_
command termination this way; one still has to wait for all
commands to complete, but at least now there is a reasonable chance
that this will happen quickly.
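
A rough userspace model of that idea, purely for illustration. This is
not kernel code: 'struct model_request', 'tick_after_eq()',
'expire_all_inflight()' and 'count_expired()' are hypothetical names, and
a plain 'now' tick value stands in for 'jiffies'. In the kernel the walk
would presumably go through something like blk_mq_tagset_busy_iter(), and
the timeout handling would likely still need to be kicked so the new
deadline gets noticed; both of those are assumptions, not something
settled in this thread.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a block-layer request (not struct request). */
struct model_request {
	bool inflight;          /* started, completion not yet seen */
	unsigned long deadline; /* tick at which the request times out */
};

/* Wraparound-safe "a is at or after b", in the spirit of time_after_eq(). */
bool tick_after_eq(unsigned long a, unsigned long b)
{
	return (long)(a - b) >= 0;
}

/*
 * Pull the deadline of every inflight request forward to 'now'.
 * This only initiates termination; completions still have to be waited for.
 * Returns the number of requests whose deadline was changed.
 */
size_t expire_all_inflight(struct model_request *rqs, size_t n,
			   unsigned long now)
{
	size_t touched = 0;

	for (size_t i = 0; i < n; i++) {
		if (!rqs[i].inflight)
			continue;
		rqs[i].deadline = now;
		touched++;
	}
	return touched;
}

/* Model of a timeout scan: count inflight requests that are due at 'now'. */
size_t count_expired(const struct model_request *rqs, size_t n,
		     unsigned long now)
{
	size_t expired = 0;

	for (size_t i = 0; i < n; i++) {
		if (rqs[i].inflight && tick_after_eq(now, rqs[i].deadline))
			expired++;
	}
	return expired;
}
```

In this model, a scan run right after expire_all_inflight() sees every
inflight request as due, while idle requests keep their old deadline.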
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
hare at suse.de +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich