[PATCH v4 2/2] ufs: core: requeue aborted request
Peter Wang (王信友)
peter.wang at mediatek.com
Thu Sep 19 19:02:03 PDT 2024
On Thu, 2024-09-19 at 11:49 -0700, Bart Van Assche wrote:
>
> External email : Please do not click links or open attachments until
> you have verified the sender or the content.
> On 9/19/24 5:16 AM, Peter Wang (王信友) wrote:
> > The four case flows for abort are as follows:
> > ----------------------------------------------------------------
> >
> > Case1: DBR ufshcd_abort
>
> Please follow the terminology from the UFSHCI 4.0 standard and use
> the
> word "legacy" instead of "DBR".
>
Hi Bart,
Okay, but the current code comments all use 'SDB mode'.
Should we just stick with that term?
> > In this case, you can see that ufshcd_release_scsi_cmd will
> > definitely be called.
> >
> > ufshcd_abort()
> > ufshcd_try_to_abort_task()// It should trigger an
> > interrupt, but the tensor might not
> > get outstanding_lock
> > clear outstanding_reqs tag
> > ufshcd_release_scsi_cmd()
> > release outstanding_lock
> >
> > ufshcd_intr()
> > ufshcd_sl_intr()
> > ufshcd_transfer_req_compl()
> > ufshcd_poll()
> > get outstanding_lock
> > clear outstanding_reqs tag
> > release outstanding_lock
> > __ufshcd_transfer_req_compl()
> > ufshcd_compl_one_cqe()
> > cmd->result = DID_REQUEUE// mediatek may need quirk
> > change DID_ABORT to DID_REQUEUE
> > ufshcd_release_scsi_cmd()
> > scsi_done();
> >
> > In most cases, ufshcd_intr will not reach scsi_done because the
> > outstanding_reqs tag is cleared by the original thread.
> > Therefore, whether there is an interrupt or not doesn't affect
> > the result because the ISR will do nothing in most cases.
> >
> > In a very low chance, the ISR will reach scsi_done and notify
> > SCSI to requeue, and the original thread will not
> > call ufshcd_release_scsi_cmd.
> > MediaTek may need to change DID_ABORT to DID_REQUEUE in this
> > situation, or perhaps not handle this ISR at all.
>
> Please modify ufshcd_compl_one_cqe() such that it ignores commands
> with status OCS_ABORTED. This will make the UFSHCI driver behave in
> the same way for all UFSHCI controllers, whether or not clearing a
> command triggers a completion interrupt.
>
Yes, I am considering how to modify the code here.
> > ----------------------------------------------------------------
> >
> > Case2: MCQ ufshcd_abort
> >
> > In the case of MCQ ufshcd_abort, you can also see that
> > ufshcd_release_scsi_cmd will definitely be called too.
> > However, there seems to be a problem here, as
> > ufshcd_release_scsi_cmd might be called twice.
> > This is because cmd is not null in ufshcd_release_scsi_cmd,
> > which the previous version would set cmd to null.
> > Skipping OCS: ABORTED in ufshcd_compl_one_cqe indeed
> > can avoid this problem. This part needs further
> > consideration on how to handle it.
> >
> > ufshcd_abort()
> > ufshcd_mcq_abort()
> > ufshcd_try_to_abort_task()// will trigger ISR
> > ufshcd_release_scsi_cmd()
> >
> > ufs_mtk_mcq_intr()
> > ufshcd_mcq_poll_cqe_lock()
> > ufshcd_mcq_process_cqe()
> > ufshcd_compl_one_cqe()
> > cmd->result = DID_ABORT
> > ufshcd_release_scsi_cmd() // will release twice
> > scsi_done()
>
> Do you agree that this case can be addressed with the
> ufshcd_compl_one_cqe() change proposed above?
>
Agree.
> > ----------------------------------------------------------------
> >
> > Case3: DBR ufshcd_err_handler
> >
> > In the case of the DBR mode error handler, it's the same;
> > ufshcd_release_scsi_cmd will also be executed, and scsi_done
> > will definitely be used to notify SCSI to requeue.
> >
> > ufshcd_err_handler()
> > ufshcd_abort_all()
> > ufshcd_abort_one()
> > ufshcd_try_to_abort_task()// It should trigger an
> > interrupt, but the tensor might not
> > ufshcd_complete_requests()
> > ufshcd_transfer_req_compl()
> > ufshcd_poll()
> > get outstanding_lock
> > clear outstanding_reqs tag
> > release outstanding_lock
> > __ufshcd_transfer_req_compl()
> > ufshcd_compl_one_cqe()
> > cmd->result = DID_REQUEUE // mediatek may need quirk
> > change DID_ABORT to DID_REQUEUE
> > ufshcd_release_scsi_cmd()
> > scsi_done()
> >
> > ufshcd_intr()
> > ufshcd_sl_intr()
> > ufshcd_transfer_req_compl()
> > ufshcd_poll()
> > get outstanding_lock
> > clear outstanding_reqs tag
> > release outstanding_lock
> > __ufshcd_transfer_req_compl()
> > ufshcd_compl_one_cqe()
> > cmd->result = DID_REQUEUE // mediatek may need quirk
> change
> > DID_ABORT to DID_REQUEUE
> > ufshcd_release_scsi_cmd()
> > scsi_done();
> >
> > At this time, the same actions are taken regardless of whether
> > there is an ISR, and with the protection of outstanding_lock,
> > only one thread will execute ufshcd_release_scsi_cmd and scsi_done.
> > ----------------------------------------------------------------
> >
> > Case4: MCQ ufshcd_err_handler
> >
> > It's the same with MCQ mode; there is protection from the cqe lock,
> > so only one thread will execute. What my patch 2 aims to do is to
> > change DID_ABORT to DID_REQUEUE in this situation.
> >
> > ufshcd_err_handler()
> > ufshcd_abort_all()
> > ufshcd_abort_one()
> > ufshcd_try_to_abort_task()// will trigger irq thread
> > ufshcd_complete_requests()
> > ufshcd_mcq_compl_pending_transfer()
> > ufshcd_mcq_poll_cqe_lock()
> > ufshcd_mcq_process_cqe()
> > ufshcd_compl_one_cqe()
> > cmd->result = DID_ABORT // should change to
> DID_REQUEUE
> > ufshcd_release_scsi_cmd()
> > scsi_done()
> >
> > ufs_mtk_mcq_intr()
> > ufshcd_mcq_poll_cqe_lock()
> > ufshcd_mcq_process_cqe()
> > ufshcd_compl_one_cqe()
> > cmd->result = DID_ABORT // should change to DID_REQUEUE
> > ufshcd_release_scsi_cmd()
> > scsi_done()
>
> For legacy and MCQ mode, I prefer the following behavior for
> ufshcd_abort_all():
> * ufshcd_compl_one_cqe() ignores commands with status OCS_ABORTED.
> * ufshcd_release_scsi_cmd() is called either by ufshcd_abort_one() or
> by ufshcd_abort_all().
>
> Do you agree with making the changes proposed above?
>
> Thank you,
This might not work, as SDB mode doesn't ignore
OCS: INVALID_OCS_VALUE but rather notifies SCSI to requeue.
So what we need to correct is to notify SCSI to requeue
when MCQ mode receives OCS: ABORTED as well.
Furthermore, ufshcd_compl_one_cqe, whether it comes from
ufshcd_abort_all or ISR, does the same thing and is protected
by a lock. Therefore, there is no need for special handling
specifically for ufshcd_abort_all.
After discussing with you, I realized that there are indeed many
deficiencies and inconsistencies here that need to be addressed.
I will submit a new patch for the content discussed above.
Thanks.
Peter
>
> Bar
More information about the Linux-mediatek
mailing list