nvme-tcp bricks my computer
Sagi Grimberg
sagi at grimberg.me
Mon Feb 15 16:35:37 EST 2021
> Hi Sagi,
Hey,
> Just to give you an update...
>
> We're still investigating the root cause of the crash.
>
> We found a bug in our Discovery Controller related to SGL format (format
> 0x5A vs. 0x01). When the host sends a "Set Feature" to configure AER/AEN
> with a SGL format of 0x5A,
This is coming from:
--
static void nvme_tcp_set_sg_null(struct nvme_command *c)
{
	struct nvme_sgl_desc *sg = &c->common.dptr.sgl;

	sg->addr = 0;
	sg->length = 0;
	sg->type = (NVME_TRANSPORT_SGL_DATA_DESC << 4) |
			NVME_SGL_FMT_TRANSPORT_A;
}
--
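For reference, a minimal standalone sketch of how that type byte decomposes, and how it differs from the 0x01 format mentioned above. The enum values are as I recall them from include/linux/nvme.h, so treat them as assumptions; the sgl_* names are made up for illustration and are not kernel helpers.

```c
#include <assert.h>
#include <stdint.h>

/* Low nibble: descriptor sub type (values assumed from include/linux/nvme.h). */
enum {
	SGL_FMT_ADDRESS     = 0x00,
	SGL_FMT_OFFSET      = 0x01,
	SGL_FMT_TRANSPORT_A = 0x0A,
};

/* High nibble: descriptor type (values assumed from include/linux/nvme.h). */
enum {
	SGL_DATA_DESC           = 0x00,
	TRANSPORT_SGL_DATA_DESC = 0x05,
};

/* Hypothetical stand-in for the 16-byte SGL descriptor in the command. */
struct sgl_desc {
	uint64_t addr;
	uint32_t length;
	uint8_t  rsvd[3];
	uint8_t  type;
};
static_assert(sizeof(struct sgl_desc) == 16, "SGL descriptor is 16 bytes");

/* Pack descriptor type and sub type into the single type byte. */
static inline uint8_t sgl_type(uint8_t desc, uint8_t subtype)
{
	return (uint8_t)((desc << 4) | (subtype & 0x0f));
}

/* Split the type byte back into its two nibbles. */
static inline uint8_t sgl_desc_type(uint8_t type) { return type >> 4; }
static inline uint8_t sgl_subtype(uint8_t type)   { return type & 0x0f; }
```

With these values, sgl_type(TRANSPORT_SGL_DATA_DESC, SGL_FMT_TRANSPORT_A) gives 0x5A (the descriptor built for a command without in-capsule data), and sgl_type(SGL_DATA_DESC, SGL_FMT_OFFSET) gives 0x01.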
> the DC responds with an R2T, which is
> obviously a bug. This does not happen when the SGL format is 0x01. We
> believe that this R2T, because it is unexpected by the nvme-tcp module,
> causes the module to crash.
I'm assuming that's because the R2T has a data length of 0? set_features
does not pass any data (the feature offset/value is in the sqe)...
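To illustrate the kind of sanity check a host could apply before acting on an R2T, here is a rough standalone sketch. It is not the nvme-tcp driver code: struct req_state and check_r2t() are hypothetical names, and the only assumption is that the host tracks how much data a request carries and how much it has already sent.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical per-request bookkeeping on the host side. */
struct req_state {
	uint32_t data_len;	/* total host-to-controller data for this command */
	uint32_t data_sent;	/* bytes already sent to the controller */
};

/*
 * Validate the offset/length carried by an R2T against the request.
 * Returns 0 if the R2T is plausible, -EPROTO otherwise.
 */
static int check_r2t(const struct req_state *req,
		     uint32_t r2t_offset, uint32_t r2t_length)
{
	/* An R2T that asks for nothing makes no sense. */
	if (r2t_length == 0)
		return -EPROTO;

	/* The controller must not ask for data that was already sent. */
	if (r2t_offset < req->data_sent)
		return -EPROTO;

	/* The requested range must fit within the command's data. */
	if ((uint64_t)r2t_offset + r2t_length > req->data_len)
		return -EPROTO;

	return 0;
}
```

For a dataless command such as set_features (data_len == 0), every R2T is rejected by this check, whether its length is zero or not, so the host can fail the connection cleanly instead of trying to build a data PDU from nothing.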
> One of our engineers that is more familiar with kernel modules is
> currently trying to understand how the R2T would cause nvme-tcp to crash.
>
> I will let you know if/when I get more info.
Cool, thanks.