[PATCH 0/4] nvme patches for 6.3

Niklas Schnelle schnelle at linux.ibm.com
Fri Feb 10 06:34:09 PST 2023


Hi Christoph, Hi Keith,

It looks like this series causes crashes on s390x.
With current linux-next-20230210 and a Samsung PM173X I get the crash
below[0]. Reverting patches 1-3 makes the NVMe work again. I also tried
reverting just patch 3, and just patches 2 and 3, but both still result
in crashes, and as far as I can see patches 2 and 3 depend on each
other. I'm not entirely sure what's going on, but patch 3 mentions
padding to the cache line size, and our 256 byte cache lines are
definitely unusual. I didn't see any obvious place where this would
break things though. I did find while debugging that in the crashing
nvme_unmap_data() iod->nr_allocations is -1 and iod->use_sgl is true,
which is weird since it looks to me like iod->nr_allocations should
only be -1 if sg_list couldn't be allocated from the pool.
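
To make that expectation concrete, here is a minimal user-space sketch
of the invariant I would have expected to hold. It is not the driver
code: struct iod_state and iod_state_consistent() are made-up names for
illustration, and the meaning of -1 is only my reading of the code.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for the two fields discussed above.
 * This is NOT the real struct nvme_iod from drivers/nvme/host/pci.c.
 */
struct iod_state {
	int nr_allocations;	/* assumed: -1 means nothing came from a pool */
	bool use_sgl;		/* request was set up with an SGL */
};

/*
 * Returns true if the state matches my expectation, i.e. a request
 * flagged as using an SGL also has a pool allocation to free later.
 */
static bool iod_state_consistent(const struct iod_state *s)
{
	return !(s->use_sgl && s->nr_allocations == -1);
}
```

The state I observed in the crashing nvme_unmap_data()
(nr_allocations == -1, use_sgl == true) fails this check, which would
fit with dma_pool_free() being handed something that was never
allocated from the pool.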

Best regards,
Niklas

[0]  
[   10.631468] nvme nvme0: Shutdown timeout set to 10 seconds
[   10.639143] nvme nvme0: 63/0/0 default/read/poll queues
[   10.646890]  nvme0n1: p1
[   10.652923] Unable to handle kernel pointer dereference in virtual kernel address space
[   10.652927] Failing address: 0000000000000000 TEID: 0000000000000483
[   10.652929] Fault in home space mode while using kernel ASCE.
[   10.652932] AS:00000001ae5c8007 R3:000000027fffc007 S:000000027fffb800 P:000000000000003d
[   10.652950] Oops: 0004 ilc:3 [#1] SMP
[   10.652954] Modules linked in: aes_s390(E+) des_s390(E) libdes(E) sha3_512_s390(E) sha3_256_s390(E) sha512_s390(E) sha256_s390(E) nvme(E) sha1_s390(E) sha_common(E) nvme_core(E) scsi_dh_rdac(E) scsi_dh_emc(E) scsi_dh_alua(E) pkey(E) zcrypt(E) rng_core(E) dm_multipath(E) autofs4(E)
[   10.652968] CPU: 3 PID: 0 Comm: swapper/3 Tainted: G            E      6.2.0-20230209.rc7.git113.20f513df926f.300.fc37.s390x+next #1
[   10.652971] Hardware name: IBM 8561 T01 703 (LPAR)
[   10.652973] Krnl PSW : 0404c00180000000 00000001aca225c6 (dma_pool_free+0x4e/0xe8)
[   10.652984]            R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3
[   10.652987] Krnl GPRS: 00000000fffffff4 0000000100000000 000000008397c490 0000000000000004
[   10.652990]            0000000000000000 ffffffe000000000 0000000000000000 0000000095e80400
[   10.652992]            04000001ace74d6e 0000000000000000 0000000000000000 000000008397c480
[   10.652994]            00000000803aa100 0000000000000001 000003800081bb10 000003800081bac0
[   10.653003] Krnl Code: 00000001aca225b6: ba13b010		cs	%r1,%r3,16(%r11)
                          00000001aca225ba: ec160048007e	cij	%r1,0,6,00000001aca2264a
                         #00000001aca225c0: c00400000024	brcl	0,00000001aca22608
                         >00000001aca225c6: e390a0080024	stg	%r9,8(%r10)
                          00000001aca225cc: e310b0180004	lg	%r1,24(%r11)
                          00000001aca225d2: e310a0000024	stg	%r1,0(%r10)
                          00000001aca225d8: e3a0b0180024	stg	%r10,24(%r11)
                          00000001aca225de: ebffb028007a	agsi	40(%r11),-1
[   10.653039] Call Trace:
[   10.653042]  [<00000001aca225c6>] dma_pool_free+0x4e/0xe8
[   10.653049]  [<000003ff7fcd02e8>] nvme_unmap_data+0x98/0x168 [nvme]
[   10.653054]  [<000003ff7fcd043c>] nvme_pci_complete_batch+0x84/0x100 [nvme]
[   10.653058]  [<000003ff7fcd0b34>] nvme_irq+0x64/0x80 [nvme]
[   10.653061]  [<00000001ac80b1c6>] __handle_irq_event_percpu+0x5e/0x1b8
[   10.653065]  [<00000001ac80b346>] handle_irq_event_percpu+0x26/0x70
[   10.653068]  [<00000001ac8112f0>] handle_percpu_irq+0x68/0x90
[   10.653072]  [<00000001ac809c8e>] generic_handle_irq+0x3e/0x60
[   10.653074]  [<00000001ac77f794>] zpci_floating_irq_handler+0xdc/0x190
[   10.653079]  [<00000001ad2f212e>] do_airq_interrupt+0x8e/0xf0
[   10.653085]  [<00000001ac80b1c6>] __handle_irq_event_percpu+0x5e/0x1b8
[   10.653087]  [<00000001ac80b346>] handle_irq_event_percpu+0x26/0x70
[   10.653090]  [<00000001ac8112f0>] handle_percpu_irq+0x68/0x90
[   10.653092]  [<00000001ac809c8e>] generic_handle_irq+0x3e/0x60
[   10.653094]  [<00000001ac7451d8>] do_irq_async+0x50/0xb0
[   10.653100]  [<00000001ad3b5dca>] do_io_irq+0xba/0x168
[   10.653103]  [<00000001ad3c5736>] io_int_handler+0xd6/0x110
[   10.653107]  [<00000001ad3c57b6>] psw_idle_exit+0x0/0xa
[   10.653109] ([<00000001ad3b6aac>] default_idle_call+0x3c/0x108)
[   10.653112]  [<00000001ac7ea43c>] do_idle+0xd4/0x168
[   10.653115]  [<00000001ac7ea696>] cpu_startup_entry+0x36/0x40
[   10.653118]  [<00000001ac750ac4>] smp_start_secondary+0x12c/0x138
[   10.653121]  [<00000001ad3c5a5e>] restart_int_handler+0x6e/0x90
[   10.653123] Last Breaking-Event-Address:
[   10.653124]  [<000003ff7fccd2a4>] 0x3ff7fccd2a4
[   10.653130] Kernel panic - not syncing: Fatal exception in interrupt



