[PATCH 05/17] nvme: wire-up support for async-passthru on char-device.
Christoph Hellwig
hch at lst.de
Thu Mar 10 23:01:48 PST 2022
On Tue, Mar 08, 2022 at 08:50:53PM +0530, Kanchan Joshi wrote:
> +/*
> + * This overlays struct io_uring_cmd pdu.
> + * Expect build errors if this grows larger than that.
> + */
> +struct nvme_uring_cmd_pdu {
> + u32 meta_len;
> + union {
> + struct bio *bio;
> + struct request *req;
> + };
> + void *meta; /* kernel-resident buffer */
> + void __user *meta_buffer;
> +} __packed;
Why is this marked __packed?
In general I'd be much happier if the meta elements were an
io_uring-level feature handled outside the driver and type-safe in
struct io_uring_cmd, with just a normal private data pointer for the
actual user, which would remove all the crazy casting.
> +static void nvme_end_async_pt(struct request *req, blk_status_t err)
> +{
> + struct io_uring_cmd *ioucmd = req->end_io_data;
> + struct nvme_uring_cmd_pdu *pdu = nvme_uring_cmd_pdu(ioucmd);
> + /* extract bio before reusing the same field for request */
> + struct bio *bio = pdu->bio;
> +
> + pdu->req = req;
> + req->bio = bio;
> + /* this takes care of setting up task-work */
> + io_uring_cmd_complete_in_task(ioucmd, nvme_pt_task_cb);
This is a bit silly. First we defer the actual request I/O completion
from the block layer to a different CPU or softirq, and then we have
another callback here. I think it would be much more useful if we
could find a way to enhance blk_mq_complete_request so that it could
complete directly in a given task. That would also be really nice for,
say, normal synchronous direct I/O.
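A rough sketch of what the end_io side could look like if such an interface existed. Both blk_mq_complete_request_in_task() and the stored task reference are hypothetical; nothing like this exists in the block layer today:

```
/* HYPOTHETICAL sketch: complete the passthru request directly in the
 * submitting task, replacing the softirq bounce plus the second
 * io_uring task-work bounce with a single hop. */
static void nvme_end_async_pt(struct request *req, blk_status_t err)
{
	struct io_uring_cmd *ioucmd = req->end_io_data;

	/* both the helper and the task pointer are made up for
	 * illustration */
	blk_mq_complete_request_in_task(req, ioucmd->task);
}
```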
> + if (ioucmd) { /* async dispatch */
> + if (cmd->common.opcode == nvme_cmd_write ||
> + cmd->common.opcode == nvme_cmd_read) {
No, we can't just check for specific commands in the passthrough handler.
> + nvme_setup_uring_cmd_data(req, ioucmd, meta, meta_buffer,
> + meta_len);
> + blk_execute_rq_nowait(req, 0, nvme_end_async_pt);
> + return 0;
> + } else {
> + /* support only read and write for now. */
> + ret = -EINVAL;
> + goto out_meta;
> + }
Please always handle errors in the first branch, and don't bother with
an else after a goto or return.
> +static int nvme_ns_async_ioctl(struct nvme_ns *ns, struct io_uring_cmd *ioucmd)
> +{
> + int ret;
> +
> + BUILD_BUG_ON(sizeof(struct nvme_uring_cmd_pdu) > sizeof(ioucmd->pdu));
> +
> + switch (ioucmd->cmd_op) {
> + case NVME_IOCTL_IO64_CMD:
> + ret = nvme_user_cmd64(ns->ctrl, ns, NULL, ioucmd);
> + break;
> + default:
> + ret = -ENOTTY;
> + }
> +
> + if (ret >= 0)
> + ret = -EIOCBQUEUED;
That's a weird way to handle the returns. Just return -EIOCBQUEUED
directly from the handler (which, as said before, should be split from
the ioctl handler anyway).
More information about the Linux-nvme mailing list