[PATCH 17/17] nvme: enable non-inline passthru commands

Kanchan Joshi joshiiitr at gmail.com
Thu Mar 10 03:50:13 PST 2022


On Thu, Mar 10, 2022 at 2:06 PM Christoph Hellwig <hch at lst.de> wrote:
>
> On Tue, Mar 08, 2022 at 08:51:05PM +0530, Kanchan Joshi wrote:
> > From: Anuj Gupta <anuj20.g at samsung.com>
> >
> > On submission,just fetch the commmand from userspace pointer and reuse
> > everything else. On completion, update the result field inside the
> > passthru command.
>
> What is that supposed to mean?  What is the reason to do it.  Remember
> to always document the why in commit logs.

I covered some of it in patch 6, but yes I need to expand the
reasoning here. Felt that retrospectively too.
So there are two ways/modes of submitting commands:

Mode 1: inline into sqe. This is the default way when passthru command
is placed inside a big sqe which has 80 bytes of space.
The only problem is - passthru command has this 'result' field
(structure below for quick reference) which is statically embedded and
not a pointer (like addr and metadata field).

struct nvme_passthru_cmd64 {
__u8 opcode;
__u8 flags;
__u16 rsvd1;
__u32 nsid;
__u32 cdw2;
__u32 cdw3;
__u64 metadata;
__u64 addr;
__u32 metadata_len;
__u32 data_len;
__u32 cdw10;
__u32 cdw11;
__u32 cdw12;
__u32 cdw13;
__u32 cdw14;
__u32 cdw15;
__u32 timeout_ms;
__u32   rsvd2;
__u64 result;
};
In sync ioctl, we always update this result field by doing put_user on
completion.
For async ioctl, since command is inside the the sqe, its lifetime is
only upto submission. SQE may get reused post submission, leaving no
way to update the "result" field on completion. Had this field been a
pointer, we could have saved this on submission and updated on
completion. But that would require redesigning this structure and
adding newer ioctl in nvme.

Coming back, even though sync-ioctl alway updates this result to
user-space, only a few nvme io commands (e.g. zone-append, copy,
zone-mgmt-send) can return this additional result (spec-wise).
Therefore in nvme, when we are dealing with inline-sqe commands from
io_uring, we never attempt to update the result. And since we don't
update the result, we limit support to only read/write passthru
commands. And fail any other command during submission itself (Patch
2).

Mode 2: Non-inline/indirect (pointer of command into sqe) submission.
User-space places a pointer of passthru command, and a flag in
io_uring saying that this is not inline.
For this, in nvme (this patch) we always update the 'result' on
completion and therefore can support all passthru commands.

Hope this makes the reasoning clear?
Plumbing wise, non-inline support does not create churn (almost all
the infra of inline-command handling is reused). Extra is
copy_from_user , and put_user.

> > +static inline bool is_inline_rw(struct io_uring_cmd *ioucmd, struct nvme_command *cmd)
>
> Overly long line.

Under 100, but sure, can fold it under 80.


-- 
Kanchan



More information about the Linux-nvme mailing list