[bug report] blktests nvme/061 hang with rdma transport and siw driver

Zhu Yanjun yanjun.zhu at linux.dev
Tue Apr 15 22:14:18 PDT 2025


On 2025/4/16 4:50, Shinichiro Kawasaki wrote:
> On Apr 15, 2025 / 17:00, Zhu Yanjun wrote:
>> On 15.04.25 15:09, Bernard Metzler wrote:
>>
>>> [  106.826346] rdma_rxe: loaded
>>> [  106.832164] loop: module loaded
>>> [  107.066868] run blktests nvme/061 at 2025-04-15 15:03:04
>>> [  107.081270] infiniband eno1_rxe: set active
>>> [  107.081274] infiniband eno1_rxe: added eno1
>>> [  107.089683] infiniband enp4s0f4d1_rxe: set active
>>> [  107.089687] infiniband enp4s0f4d1_rxe: added enp4s0f4d1
>>> [  107.264770] loop0: detected capacity change from 0 to 2097152
>>> [  107.267376] nvmet: adding nsid 1 to subsystem blktests-subsystem-1
>>> [  107.271276] nvmet_rdma: enabling port 0 (10.0.0.2:4420)
>>> [  107.312957] BUG: kernel NULL pointer dereference, address: 0000000000000028
>>> [  107.312973] #PF: supervisor read access in kernel mode
>>> [  107.312979] #PF: error_code(0x0000) - not-present page
>>> [  107.312986] PGD 0 P4D 0
>>> [  107.312992] Oops: Oops: 0000 [#1] SMP PTI
>>> [  107.312999] CPU: 1 UID: 0 PID: 123 Comm: kworker/u32:4 Not tainted 6.15.0-rc2 #1 PREEMPT(undef)
>>> [  107.313008] Hardware name: LENOVO 10A6S05601/SHARKBAY, BIOS FBKTD8AUS 09/17/2019
>>> [  107.313016] Workqueue: rxe_wq do_work [rdma_rxe]
>>> [  107.313030] RIP: 0010:rxe_mr_copy+0x58/0x230 [rdma_rxe]
>>
>> Hi, Bernard
>>
>> An interesting test. Can you find the source line for
>> rxe_mr_copy+0x58/0x230 with the crash tool?
>>
>> That way we can find which variable is the NULL pointer.
> 
> I observe the failure too, but I also observe that the recent patch [1] avoids it.
> With the patch applied to kernel v6.15-rc2, I no longer observe the failure
> when repeating the test case 100 times with the rxe driver.
> 
> [1] https://lore.kernel.org/linux-rdma/20250402032657.1762800-1-lizhijian@fujitsu.com/

Hi, Shinichiro

Your confirmation is important to us.
Thanks a lot. I am very glad that the patch above fixes the
aforementioned problem.

Best Regards,
Zhu Yanjun




