[PATCH v3] nvme-rdma: handle nvme completion data length

Max Gurtovoy mgurtovoy at nvidia.com
Wed Oct 28 12:58:08 EDT 2020


On 10/25/2020 1:51 PM, zhenwei pi wrote:
> Hit a kernel warning:
> refcount_t: underflow; use-after-free.
> WARNING: CPU: 0 PID: 0 at lib/refcount.c:28
>
> RIP: 0010:refcount_warn_saturate+0xd9/0xe0
> Call Trace:
>   <IRQ>
>   nvme_rdma_recv_done+0xf3/0x280 [nvme_rdma]
>   __ib_process_cq+0x76/0x150 [ib_core]
>   ...
>
> The reason is that a zero-byte message was received from the target,
> and the host side continued processing without checking the length, so
> the previous CQE was processed twice.
>
> Do a sanity check on the received data length, and try to recover in
> the corrupted CQE case.
>
> Because a zero-byte message is not defined in the spec, using zero-byte
> messages to detect dead connections at the transport layer is not
> standard, so currently still treat it as illegal.
>
> Thanks to Chao Leng & Sagi for suggestions.
>
> Signed-off-by: zhenwei pi <pizhenwei at bytedance.com>
> ---
>   drivers/nvme/host/rdma.c | 8 ++++++++
>   1 file changed, 8 insertions(+)
>
It seems strange that the target sends zero-byte packets.

Can you specify which target this is and what the scenario is?

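For anyone following the thread, the length check described in the commit
message can be sketched in userspace roughly as below. This is not the patch
itself: struct sketch_cqe, struct sketch_wc and sketch_recv_done() are
hypothetical stand-ins invented for illustration, and the only assumption
taken from NVMe is that a completion queue entry is 16 bytes. The idea is
that a receive shorter than one full CQE leaves the buffer holding the
previous CQE, so the handler should ask for recovery instead of parsing it.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a 16-byte NVMe completion queue entry. */
struct sketch_cqe {
	uint32_t result_lo;
	uint32_t result_hi;
	uint16_t sq_head;
	uint16_t sq_id;
	uint16_t command_id;
	uint16_t status;
};

_Static_assert(sizeof(struct sketch_cqe) == 16, "CQE sketch must be 16 bytes");

/* Hypothetical stand-in for the transport's work completion. */
struct sketch_wc {
	uint32_t byte_len;	/* bytes actually written into the receive buffer */
};

enum sketch_action {
	SKETCH_PROCESS,		/* buffer holds a full, fresh CQE */
	SKETCH_RECOVER,		/* short receive: buffer may still hold a stale CQE */
};

/*
 * Decide what to do with a receive completion. Anything shorter than one
 * CQE (including a zero-byte message) is rejected before the buffer is
 * parsed, so a stale entry is never completed a second time.
 */
static enum sketch_action sketch_recv_done(const struct sketch_wc *wc)
{
	if (wc->byte_len < sizeof(struct sketch_cqe))
		return SKETCH_RECOVER;

	return SKETCH_PROCESS;
}
```

With this sketch, a work completion with byte_len of 0 yields SKETCH_RECOVER
and one with byte_len of 16 yields SKETCH_PROCESS.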