[PATCH v4] scsi: ufs: Cleanup completed request without interrupt notification

Can Guo cang at codeaurora.org
Fri Jul 31 04:00:28 EDT 2020


Hi Bart,

On 2020-07-31 12:06, Bart Van Assche wrote:
> On 2020-07-30 18:30, Stanley Chu wrote:
>> On Mon, 2020-07-27 at 11:18 +0000, Avri Altman wrote:
>>> Looks good to me.
>>> But better wait and see if Bart have any further reservations.
>> 
>> Would you have any further suggestions?
> 
> Today is the first time that I took a look at ufshcd_abort(). The
> approach of that function looks wrong to me. This is how I think that a
> SCSI LLD abort handler should work:
> (1) Serialize against the completion path
> (__ufshcd_transfer_req_compl()) such that it cannot happen that the
> abort handler and the regular completion path both call
> cmd->scsi_done(cmd) at the same time. I'm not sure whether an existing
> synchronization object can be used for this purpose or whether a new
> synchronization object has to be introduced to serialize scsi_done()
> calls from __ufshcd_transfer_req_compl() and ufshcd_abort().
> (2) While holding that synchronization object, check whether the SCSI
> command is still outstanding. If so, submit a SCSI abort TMR to the 
> device.
> (3) If the command has been aborted, call scsi_done() and return
> SUCCESS. If aborting failed and the command is still in progress,
> return FAILED.
> 
> An example is available in srp_abort() in
> drivers/infiniband/ulp/srp/ib_srp.c.
> 
> Bart.
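
For reference, the three steps quoted above can be modelled in userspace roughly as follows. This is only a sketch under stated assumptions, not ufshcd or ib_srp code: struct model_req, model_completion(), model_abort() and the two model_tmr_*() hooks are hypothetical names, a pthread mutex stands in for whatever synchronization object a driver would actually use, and returning SUCCESS for a command that has already completed is my assumption, since the steps do not cover that case.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

enum model_result { MODEL_SUCCESS, MODEL_FAILED };

/* Hypothetical, heavily reduced model of an outstanding request. */
struct model_req {
	pthread_mutex_t lock;	/* hypothetical serialization object */
	bool outstanding;	/* true while the device still owns the command */
	int done_calls;		/* how many times "scsi_done" was called */
};

/* Hypothetical hook: returns true if the device aborted the command. */
typedef bool (*model_abort_tmr_fn)(struct model_req *req);

bool model_tmr_succeeds(struct model_req *req) { (void)req; return true; }
bool model_tmr_fails(struct model_req *req) { (void)req; return false; }

/* Regular completion path: completes the command at most once. */
void model_completion(struct model_req *req)
{
	pthread_mutex_lock(&req->lock);
	if (req->outstanding) {
		req->outstanding = false;
		req->done_calls++;
	}
	pthread_mutex_unlock(&req->lock);
}

/* Abort handler following steps (1) to (3). */
enum model_result model_abort(struct model_req *req, model_abort_tmr_fn send_tmr)
{
	enum model_result ret = MODEL_SUCCESS;

	assert(req != NULL && send_tmr != NULL);

	pthread_mutex_lock(&req->lock);		/* (1) serialize with completion */
	if (req->outstanding) {			/* (2) still outstanding? */
		if (send_tmr(req)) {
			req->outstanding = false;
			req->done_calls++;	/* (3) aborted: done + SUCCESS */
		} else {
			ret = MODEL_FAILED;	/* (3) still in progress */
		}
	}
	pthread_mutex_unlock(&req->lock);
	return ret;
}
```

Because both paths check and clear "outstanding" under the same lock, done_calls can never exceed one for a given request in this model.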


AFAIK, synchronization of scsi_done() is not a problem here, because the
SCSI layer uses an atomic state bit of the SCSI command, namely
SCMD_STATE_COMPLETE, to prevent the abort and the real completion of a
command from running concurrently.

See scsi_times_out() below, I hope it helps.

enum blk_eh_timer_return scsi_times_out(struct request *req)
{
...
         if (rtn == BLK_EH_DONE) {
                 /*
                  * Set the command to complete first in order to prevent a real
                  * completion from releasing the command while error handling
                  * is using it. If the command was already completed, then the
                  * lower level driver beat the timeout handler, and it is safe
                  * to return without escalating error recovery.
                  *
                  * If timeout handling lost the race to a real completion, the
                  * block layer may ignore that due to a fake timeout injection,
                  * so return RESET_TIMER to allow error handling another shot
                  * at this command.
                  */
                 if (test_and_set_bit(SCMD_STATE_COMPLETE, &scmd->state))
                         return BLK_EH_RESET_TIMER;
                 if (scsi_abort_command(scmd) != SUCCESS) {
                         set_host_byte(scmd, DID_TIME_OUT);
                         scsi_eh_scmd_add(scmd);
                 }
         }
}
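
To make the arbitration concrete, here is a minimal userspace sketch of the same idea, using C11 atomic_fetch_or() as an analogue of the kernel's test_and_set_bit(). All names (struct model_cmd, MODEL_STATE_COMPLETE, model_complete(), model_times_out()) are hypothetical and are not kernel APIs. The sketch also assumes that the regular completion path performs the same test-and-set on the bit before finishing the command.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the SCMD_STATE_COMPLETE bit in scmd->state. */
#define MODEL_STATE_COMPLETE 0x1UL

/* Hypothetical, heavily reduced model of struct scsi_cmnd. */
struct model_cmd {
	atomic_ulong state;	/* holds MODEL_STATE_COMPLETE once claimed */
	int done_calls;		/* how often the completion path ran */
	int abort_calls;	/* how often the abort path ran */
};

/* Userspace analogue of test_and_set_bit(): returns the old bit value. */
bool model_test_and_set_complete(struct model_cmd *cmd)
{
	assert(cmd != NULL);
	unsigned long old = atomic_fetch_or(&cmd->state, MODEL_STATE_COMPLETE);
	return (old & MODEL_STATE_COMPLETE) != 0;
}

/* Regular completion: only the first claimant finishes the command. */
bool model_complete(struct model_cmd *cmd)
{
	if (model_test_and_set_complete(cmd))
		return false;	/* timeout handling already owns the command */
	cmd->done_calls++;
	return true;
}

/* Timeout path: go on to abort only if completion has not won the race. */
bool model_times_out(struct model_cmd *cmd)
{
	if (model_test_and_set_complete(cmd))
		return false;	/* real completion beat the timeout handler */
	cmd->abort_calls++;
	return true;
}
```

Whichever path sets the bit first owns the command, so in this model the completion work and the abort work never both run for the same command.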

Thanks,

Can Guo.



More information about the linux-arm-kernel mailing list