[bug report] blktests nvme/061 hang with rdma transport and siw driver

Shinichiro Kawasaki shinichiro.kawasaki at wdc.com
Tue Apr 15 19:50:28 PDT 2025


On Apr 15, 2025 / 17:00, Zhu Yanjun wrote:
> On 15.04.25 15:09, Bernard Metzler wrote:
> 
> > [  106.826346] rdma_rxe: loaded
> > [  106.832164] loop: module loaded
> > [  107.066868] run blktests nvme/061 at 2025-04-15 15:03:04
> > [  107.081270] infiniband eno1_rxe: set active
> > [  107.081274] infiniband eno1_rxe: added eno1
> > [  107.089683] infiniband enp4s0f4d1_rxe: set active
> > [  107.089687] infiniband enp4s0f4d1_rxe: added enp4s0f4d1
> > [  107.264770] loop0: detected capacity change from 0 to 2097152
> > [  107.267376] nvmet: adding nsid 1 to subsystem blktests-subsystem-1
> > [  107.271276] nvmet_rdma: enabling port 0 (10.0.0.2:4420)
> > [  107.312957] BUG: kernel NULL pointer dereference, address: 0000000000000028
> > [  107.312973] #PF: supervisor read access in kernel mode
> > [  107.312979] #PF: error_code(0x0000) - not-present page
> > [  107.312986] PGD 0 P4D 0
> > [  107.312992] Oops: Oops: 0000 [#1] SMP PTI
> > [  107.312999] CPU: 1 UID: 0 PID: 123 Comm: kworker/u32:4 Not tainted 6.15.0-rc2 #1 PREEMPT(undef)
> > [  107.313008] Hardware name: LENOVO 10A6S05601/SHARKBAY, BIOS FBKTD8AUS 09/17/2019
> > [  107.313016] Workqueue: rxe_wq do_work [rdma_rxe]
> > [  107.313030] RIP: 0010:rxe_mr_copy+0x58/0x230 [rdma_rxe]
> 
> Hi, Bernard
> 
> An interesting test. Can you find the line number of
> (rxe_mr_copy+0x58/0x230) with crash tool?
> 
> Thus we can find what variable is becoming NULL pointer.

I observe the failure too, but I also observe that the recent patch [1] avoids it.
With the patch applied to kernel v6.15-rc2, I no longer observe the failure when
repeating the test case 100 times with the rxe driver.

[1] https://lore.kernel.org/linux-rdma/20250402032657.1762800-1-lizhijian@fujitsu.com/

