[PATCH 04/47] block: provide a new BLK_EH_QUIESCED timeout return value
Jeff Moyer
jmoyer at redhat.com
Tue Nov 24 07:16:51 PST 2015
Hi Christoph,
Christoph Hellwig <hch at lst.de> writes:
> This marks the request as one that's not actually completed yet, but
> should be reaped next time blk_mq_complete_request comes in. This is
> useful if the abort handler kicked off a reset that will complete all
> pending requests.
What's the purpose, though? Is this an optimization?
We've had "fun" problems with races between completion and timeout
before. I can't say I'm too keen on adding more complexity to this code
path. Have you considered what happens in your new code when this race
occurs? I don't expect it to cause any issues in the mq case, since the
timeout handler should run on the same cpu as the completion code for a
given request (right?). However, for the old code path, they could run
in parallel.
blk_complete_request:
A if (!blk_mark_rq_complete(req) ||
B     test_and_clear_bit(REQ_ATOM_QUIESCED, &req->atomic_flags))
C         __blk_complete_request(req);
could run alongside:
blk_rq_check_expired:
1 if (!blk_mark_rq_complete(rq))
2 blk_rq_timed_out(rq);
So, if 1 comes before A, we have two cases to consider:
i. the expiration path has not yet set REQ_ATOM_QUIESCED when the
completion code runs, and so the completion code does nothing.
ii. the expiration path *does* set REQ_ATOM_QUIESCED. In this instance,
will we get yet another completion for the request when the command
is ultimately retired by the adapter reset?
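To make the two orderings concrete, here is a rough user-space sketch of the flag handling only. It is not kernel code: the model_* helpers and the completions counter are hypothetical stand-ins, C11 atomics replace the kernel bitops, and the timeout handler is assumed to return BLK_EH_QUIESCED. The steps are replayed sequentially in the order described above rather than run on two CPUs.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Bit numbers as in the patched block/blk.h. */
enum { REQ_ATOM_COMPLETE = 0, REQ_ATOM_STARTED, REQ_ATOM_QUIESCED };

/* Hypothetical stand-in for struct request: flags plus a counter. */
struct request {
	atomic_ulong atomic_flags;
	int completions;	/* times the completion handler ran */
};

static void model_init_request(struct request *rq)
{
	atomic_init(&rq->atomic_flags, 0);
	rq->completions = 0;
}

/* Like test_and_set_bit(): set the bit, return its previous value. */
static bool model_test_and_set_bit(int nr, atomic_ulong *addr)
{
	unsigned long mask = 1UL << nr;

	return (atomic_fetch_or(addr, mask) & mask) != 0;
}

/* Like test_and_clear_bit(): clear the bit, return its previous value. */
static bool model_test_and_clear_bit(int nr, atomic_ulong *addr)
{
	unsigned long mask = 1UL << nr;

	return (atomic_fetch_and(addr, ~mask) & mask) != 0;
}

/* Like blk_mark_rq_complete(): true if the request was already marked. */
static bool model_mark_rq_complete(struct request *rq)
{
	return model_test_and_set_bit(REQ_ATOM_COMPLETE, &rq->atomic_flags);
}

/* Steps A, B, C: the patched completion path. */
static void model_complete_request(struct request *rq)
{
	if (!model_mark_rq_complete(rq) ||
	    model_test_and_clear_bit(REQ_ATOM_QUIESCED, &rq->atomic_flags))
		rq->completions++;
}

/* Step 1: true if the expiration path claimed the request. */
static bool model_check_expired(struct request *rq)
{
	return !model_mark_rq_complete(rq);
}

/* Step 2, assuming the handler returned BLK_EH_QUIESCED. */
static void model_timed_out_quiesced(struct request *rq)
{
	model_test_and_set_bit(REQ_ATOM_QUIESCED, &rq->atomic_flags);
}

/* Case i: 1, then A-C, then 2, then the completion from the reset. */
static int model_case_i(void)
{
	struct request rq;
	bool claimed;

	model_init_request(&rq);
	claimed = model_check_expired(&rq);	/* step 1 runs first */
	assert(claimed);
	(void)claimed;
	model_complete_request(&rq);		/* racing completion */
	assert(rq.completions == 0);		/* it does nothing */
	model_timed_out_quiesced(&rq);		/* step 2 */
	model_complete_request(&rq);		/* completion from the reset */
	return rq.completions;
}

/* Case ii: 1, then 2, then A-C, then the completion from the reset. */
static int model_case_ii(void)
{
	struct request rq;

	model_init_request(&rq);
	(void)model_check_expired(&rq);		/* step 1 */
	model_timed_out_quiesced(&rq);		/* step 2 */
	model_complete_request(&rq);		/* racing completion reaps it */
	model_complete_request(&rq);		/* completion from the reset */
	return rq.completions;
}
```

In this model both model_case_i() and model_case_ii() return 1, because the racing completion in case ii consumes the QUIESCED bit and the later one finds it clear. The sketch tracks only the flag bits. It does not model the request being freed or reused between the two completions, so it does not answer the question in case ii for a real driver.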
Cheers,
Jeff
>
> Signed-off-by: Christoph Hellwig <hch at lst.de>
> ---
> block/blk-mq.c | 6 +++++-
> block/blk-softirq.c | 3 ++-
> block/blk-timeout.c | 3 +++
> block/blk.h | 1 +
> include/linux/blkdev.h | 1 +
> 5 files changed, 12 insertions(+), 2 deletions(-)
>
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 8354601..76773dc 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -383,7 +383,8 @@ void blk_mq_complete_request(struct request *rq, int error)
>
> if (unlikely(blk_should_fake_timeout(q)))
> return;
> - if (!blk_mark_rq_complete(rq)) {
> + if (!blk_mark_rq_complete(rq) ||
> + test_and_clear_bit(REQ_ATOM_QUIESCED, &rq->atomic_flags)) {
> rq->errors = error;
> __blk_mq_complete_request(rq);
> }
> @@ -586,6 +587,9 @@ void blk_mq_rq_timed_out(struct request *req, bool reserved)
> break;
> case BLK_EH_NOT_HANDLED:
> break;
> + case BLK_EH_QUIESCED:
> + set_bit(REQ_ATOM_QUIESCED, &req->atomic_flags);
> + break;
> default:
> printk(KERN_ERR "block: bad eh return: %d\n", ret);
> break;
> diff --git a/block/blk-softirq.c b/block/blk-softirq.c
> index 53b1737..9d47fbc 100644
> --- a/block/blk-softirq.c
> +++ b/block/blk-softirq.c
> @@ -167,7 +167,8 @@ void blk_complete_request(struct request *req)
> {
> if (unlikely(blk_should_fake_timeout(req->q)))
> return;
> - if (!blk_mark_rq_complete(req))
> + if (!blk_mark_rq_complete(req) ||
> + test_and_clear_bit(REQ_ATOM_QUIESCED, &req->atomic_flags))
> __blk_complete_request(req);
> }
> EXPORT_SYMBOL(blk_complete_request);
> diff --git a/block/blk-timeout.c b/block/blk-timeout.c
> index aedd128..b3a7f20 100644
> --- a/block/blk-timeout.c
> +++ b/block/blk-timeout.c
> @@ -96,6 +96,9 @@ static void blk_rq_timed_out(struct request *req)
> blk_add_timer(req);
> blk_clear_rq_complete(req);
> break;
> + case BLK_EH_QUIESCED:
> + set_bit(REQ_ATOM_QUIESCED, &req->atomic_flags);
> + break;
> case BLK_EH_NOT_HANDLED:
> /*
> * LLD handles this for now but in the future
> diff --git a/block/blk.h b/block/blk.h
> index 37b9165..f4c98f8 100644
> --- a/block/blk.h
> +++ b/block/blk.h
> @@ -120,6 +120,7 @@ void blk_account_io_done(struct request *req);
> enum rq_atomic_flags {
> REQ_ATOM_COMPLETE = 0,
> REQ_ATOM_STARTED,
> + REQ_ATOM_QUIESCED,
> };
>
> /*
> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
> index 9a8424a..5df5fb13 100644
> --- a/include/linux/blkdev.h
> +++ b/include/linux/blkdev.h
> @@ -223,6 +223,7 @@ enum blk_eh_timer_return {
> BLK_EH_NOT_HANDLED,
> BLK_EH_HANDLED,
> BLK_EH_RESET_TIMER,
> + BLK_EH_QUIESCED,
> };
>
> typedef enum blk_eh_timer_return (rq_timed_out_fn)(struct request *);