Need some pointers to debug a KASAN splat in NVMe over Fabrics with rdma-rxe

Johannes Thumshirn jthumshirn at suse.de
Wed Mar 8 07:35:50 PST 2017


Hi Moni et al.,

I'm getting a KASAN stack-out-of-bounds in rxe_post_send+0xdfe/0x1830
[rdma_rxe] at addr ffff8800187072e8 with v4.11-rc1.

rxe_post_send+0xdfe is the following (note: the pr_err was inserted by
me to aid debugging).

(gdb) list *(rxe_post_send+0xdfe)
0x1dc3e is in rxe_post_send (drivers/infiniband/sw/rxe/rxe_verbs.c:765).
760             pr_err("%s: *_wr(ibwr): %p\n",
761                    __func__, (void *)(mask & WR_ATOMIC_MASK ? atomic_wr(ibwr)
762                    : rdma_wr(ibwr)));
763
764             wqe->iova               = (mask & WR_ATOMIC_MASK) ?
765                                             atomic_wr(ibwr)->remote_addr :
766                                             rdma_wr(ibwr)->remote_addr;
767             wqe->mask               = mask;
768             wqe->dma.length         = length;
769             wqe->dma.resid          = length;

Coincidentally, ffff8800187072e8 = ibwr + 0x28. ibwr comes from
nvme_rdma_post_send() and has an opcode of IB_WR_SEND (verified). So the
rdma_wr(ibwr) call cannot return a correct/valid parent object (neither
could atomic_wr(ibwr)).
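
To illustrate why this address is suspicious, here is a minimal
userspace sketch, assuming a 64-bit build. The mock_* structs and
helpers are hypothetical stand-ins that only imitate the general shape
of struct ib_send_wr, struct ib_rdma_wr and rdma_wr(); they are not the
kernel definitions. The point is that a container_of()-style helper
blindly assumes the WR is embedded in a larger parent, so for a bare
send WR the first parent-only field lies just past the end of the
object:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a plain send WR (not kernel code). */
struct mock_send_wr {
	struct mock_send_wr *next;
	uint64_t wr_id;
	void *sg_list;
	int num_sge;
	int opcode;
	int send_flags;
	uint32_t imm_data;
};

/* Hypothetical stand-in for an RDMA WR that embeds the base WR. */
struct mock_rdma_wr {
	struct mock_send_wr wr;		/* embedded base WR */
	uint64_t remote_addr;		/* exists only in the parent object */
	uint32_t rkey;
};

#define mock_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/*
 * Imitates the shape of an rdma_wr()-style helper: pure pointer
 * arithmetic, with no check that a parent object really exists.
 */
static inline struct mock_rdma_wr *mock_rdma_wr(struct mock_send_wr *wr)
{
	assert(wr != NULL);
	return mock_container_of(wr, struct mock_rdma_wr, wr);
}

/* Offset from the base WR at which ->remote_addr would be read. */
static inline size_t mock_remote_addr_offset(void)
{
	return offsetof(struct mock_rdma_wr, remote_addr);
}

/* Non-zero if that read starts at or beyond the end of a bare send WR. */
static inline int mock_read_is_out_of_bounds(void)
{
	return mock_remote_addr_offset() >= sizeof(struct mock_send_wr);
}
```

With this mock layout, mock_remote_addr_offset() equals
sizeof(struct mock_send_wr), so dereferencing
mock_rdma_wr(&wr)->remote_addr for a stack-allocated bare send WR would
read the first bytes past the object. Whether the real structures have
exactly this layout is an assumption of the sketch.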

So much for the easy/mechanical part.

I could special-case IB_WR_SEND in rxe's init_send_wqe(), but I know
neither whether that is correct nor how the wqe elements (especially
wqe->iova) should be set up.

So any help would be appreciated here.

Thanks in advance,
	Johannes
-- 
Johannes Thumshirn                                          Storage
jthumshirn at suse.de                                +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850


