nvme-fabrics: crash at nvme connect-all
Ming Lin
mlin at kernel.org
Fri Jun 10 13:18:41 PDT 2016
On Fri, Jun 10, 2016 at 1:15 PM, Steve Wise <swise at opengridcomputing.com> wrote:
>> > I applied your patch and it does avoid the crash. So the connect to the target
>> > device via cxgb4 that I setup to fail in ib_alloc_mr(), correctly fails w/o
>> > crashing. After this connect failure, I tried to connect the same target
>> > device but via another rdma path (mlx4 instead of cxgb4 which was setup to fail)
>> > and got a different failure. Not sure if this is a regression from your fix or
>> > just another error path problem:
>> >
>> > BUG: unable to handle kernel paging request at ffff881027d00e00
>> > IP: [<ffffffffa04c5a49>] nvmf_parse_options+0x369/0x4a0 [nvme_fabrics]
>>
>> Could you find out which line of code this is?
>
> From objdump -S -l nvme-fabrics.ko, nvmf_parse_options starts at 6e0:
>
> ---
> 00000000000006e0 <nvmf_parse_options>:
> nvmf_parse_options():
> /usr/local/src/linux-2.6/drivers/nvme/host/fabrics.c:515
> { NVMF_OPT_ERR, NULL }
> };
>
> static int nvmf_parse_options(struct nvmf_ctrl_options *opts,
> const char *buf)
> {
> 6e0: 55 push %rbp
> ----
>
> So 0x6e0+0x369 = 0xa49 which is in an inline atomic_add_return(), I think:
>
> ---
> atomic_add_return():
> /usr/local/src/linux-2.6/./arch/x86/include/asm/atomic.h:156
> *
> * Atomically adds @i to @v and returns @i + @v
> */
> static __always_inline int atomic_add_return(int i, atomic_t *v)
> {
> return i + xadd(&v->counter, i);
> a3d: 48 8b 15 00 00 00 00 mov 0x0(%rip),%rdx # a44 <nvmf_parse_options+0x364>
> a44: b8 01 00 00 00 mov $0x1,%eax
> a49: f0 0f c1 02 lock xadd %eax,(%rdx)
> a4d: 83 c0 01 add $0x1,%eax
> kref_get():
> /usr/local/src/linux-2.6/include/linux/kref.h:46
> {
> /* If refcount was 0 before incrementing then we have a race
> * condition when this kref is freeing by some other thread right now.
> * In this case one should use kref_get_unless_zero()
> */
> WARN_ON_ONCE(atomic_inc_return(&kref->refcount) < 2);
> a50: 83 f8 01 cmp $0x1,%eax
> a53: 7e 1e jle a73 <nvmf_parse_options+0x393>
> nvmf_parse_options():
> /usr/local/src/linux-2.6/drivers/nvme/host/fabrics.c:689
> ---
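
For reference, the arithmetic quoted above (function start in the objdump listing plus the offset from the oops line) can be written out as a small helper. This is only a sketch: the function name oops_to_objdump_addr and its parameters are made up for illustration, and the values come from the "nvmf_parse_options+0x369/0x4a0" line and the 6e0 start address in the listing.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: map a "symbol+offset/size" location from an oops
 * to the matching address in an objdump listing of the module.
 * Returns 0 on success, -1 if the offset lies outside the function.
 */
static int oops_to_objdump_addr(uint64_t sym_start, uint64_t offset,
                                uint64_t sym_size, uint64_t *addr)
{
    if (offset >= sym_size)
        return -1;              /* offset cannot belong to this symbol */
    *addr = sym_start + offset; /* e.g. 0x6e0 + 0x369 */
    return 0;
}
```

With sym_start 0x6e0, offset 0x369 and size 0x4a0 this gives 0xa49, the "lock xadd" instruction inlined from kref_get() in the listing.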
Does Sagi's patch help?
Author: Sagi Grimberg <sagi at grimberg.me>
Date:   Thu Jun 9 13:20:09 2016 -0700

    fabrics: Don't directly free opts->host

    It might be the default host, so we need to call
    nvmet_put_host (which is safe against NULL lucky for
    us).

    Reported-by: Alexander Nezhinsky <alexander.nezhinsky at excelero.com>
    Signed-off-by: Sagi Grimberg <sagi at grimberg.me>

diff --git a/drivers/nvme/host/fabrics.c b/drivers/nvme/host/fabrics.c
index 225a732..b86b637 100644
--- a/drivers/nvme/host/fabrics.c
+++ b/drivers/nvme/host/fabrics.c
@@ -805,7 +805,7 @@ nvmf_create_ctrl(struct device *dev, const char *buf, size_t count)
 out_unlock:
 	mutex_unlock(&nvmf_transports_mutex);
 out_free_opts:
-	kfree(opts->host);
+	nvmf_host_put(opts->host);
 	kfree(opts);
 	return ERR_PTR(ret);
 }
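
To illustrate why the put matters, here is a minimal userspace sketch of the pattern, not the kernel code. The names struct host, host_alloc, host_get, host_put and hosts_freed are hypothetical, and a plain int stands in for the kref, so it is single-threaded only. The assumption is that opts->host may point at a shared default host: freeing it directly on a failed connect would leave the next connect taking a reference on freed memory, which would be consistent with a fault in kref_get() inside nvmf_parse_options.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a refcounted host object; not the kernel struct. */
struct host {
    int refcount; /* plain int here; the kernel uses a kref */
};

/* Counts how many hosts were actually freed (illustration only). */
static int hosts_freed;

/* Allocate a host holding one reference, owned by the caller. */
static struct host *host_alloc(void)
{
    struct host *h = calloc(1, sizeof(*h));

    if (h)
        h->refcount = 1;
    return h;
}

/* Take an extra reference, as a connect attempt would on the default host. */
static struct host *host_get(struct host *h)
{
    h->refcount++;
    return h;
}

/* Drop one reference; free only when the last one is gone. NULL is a no-op. */
static void host_put(struct host *h)
{
    if (!h)
        return;
    if (--h->refcount == 0) {
        free(h);
        hosts_freed++;
    }
}
```

In this sketch an error path that calls host_put() on its reference leaves the default host alive for the next connect attempt, whereas calling free() on it directly would not, and host_put(NULL) is harmless when no host was set.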