[PATCH 0/4] nvme patches for 6.3
Niklas Schnelle
schnelle at linux.ibm.com
Fri Feb 10 06:34:09 PST 2023
Hi Christoph, Hi Keith,
It looks like this series causes crashes on s390x.
With current linux-next-20230210 and a Samsung PM173X I get the crash
below [0]. Reverting patches 1-3 makes the NVMe work again. I tried
reverting just patch 3, and also patches 2 and 3 together, but both
still crash, and as far as I can see patches 2 and 3 depend on each
other. I'm not entirely sure what's going on, but patch 3 mentions
padding to the cache line size, and our 256 byte cache lines are
definitely unusual. I didn't see any obvious place where this would
break things though. I did debug that in the crashing nvme_unmap_data()
iod->nr_allocations is -1 and iod->use_sgl is true, which is weird since
it looks to me like iod->nr_allocations should only be -1 if sg_list
couldn't be allocated from the pool.
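To make the state described above concrete, here is a minimal C sketch.
It is not the driver's code: struct iod_state and has_pool_allocation()
are hypothetical names, and the only assumption carried over is that
nr_allocations == -1 means nothing was taken from a DMA pool. Under that
assumption, an unmap path would have to check the count before freeing
anything, regardless of use_sgl.

```c
#include <stdbool.h>

/*
 * Hypothetical, simplified stand-in for the per-request state.
 * This is NOT the real struct nvme_iod; only the two fields
 * mentioned in the report are modelled.
 */
struct iod_state {
	int nr_allocations;	/* assumed: -1 means no pool allocation was made */
	bool use_sgl;		/* request was set up with an SGL rather than PRPs */
};

/*
 * Returns true only when there is a pool-allocated list to hand back.
 * use_sgl alone is deliberately not treated as proof of an allocation.
 */
bool has_pool_allocation(const struct iod_state *iod)
{
	return iod->nr_allocations >= 0;
}
```

With the values seen in the crash (nr_allocations == -1, use_sgl == true)
this check returns false, so a free path guarded this way would skip the
pool free instead of passing an unallocated list to it.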
Best regards,
Niklas
[0]
[ 10.631468] nvme nvme0: Shutdown timeout set to 10 seconds
[ 10.639143] nvme nvme0: 63/0/0 default/read/poll queues
[ 10.646890] nvme0n1: p1
[ 10.652923] Unable to handle kernel pointer dereference in virtual kernel address space
[ 10.652927] Failing address: 0000000000000000 TEID: 0000000000000483
[ 10.652929] Fault in home space mode while using kernel ASCE.
[ 10.652932] AS:00000001ae5c8007 R3:000000027fffc007 S:000000027fffb800 P:000000000000003d
[ 10.652950] Oops: 0004 ilc:3 [#1] SMP
[ 10.652954] Modules linked in: aes_s390(E+) des_s390(E) libdes(E) sha3_512_s390(E) sha3_256_s390(E) sha512_s390(E) sha256_s390(E) nvme(E) sha1_s390(E) sha_common(E) nvme_core(E) scsi_dh_rdac(E) scsi_dh_emc(E) scsi_dh_alua(E) pkey(E) zcrypt(E) rng_core(E) dm_multipath(E) autofs4(E)
[ 10.652968] CPU: 3 PID: 0 Comm: swapper/3 Tainted: G E 6.2.0-20230209.rc7.git113.20f513df926f.300.fc37.s390x+next #1
[ 10.652971] Hardware name: IBM 8561 T01 703 (LPAR)
[ 10.652973] Krnl PSW : 0404c00180000000 00000001aca225c6 (dma_pool_free+0x4e/0xe8)
[ 10.652984] R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3
[ 10.652987] Krnl GPRS: 00000000fffffff4 0000000100000000 000000008397c490 0000000000000004
[ 10.652990] 0000000000000000 ffffffe000000000 0000000000000000 0000000095e80400
[ 10.652992] 04000001ace74d6e 0000000000000000 0000000000000000 000000008397c480
[ 10.652994] 00000000803aa100 0000000000000001 000003800081bb10 000003800081bac0
[ 10.653003] Krnl Code: 00000001aca225b6: ba13b010		cs	%r1,%r3,16(%r11)
00000001aca225ba: ec160048007e	cij	%r1,0,6,00000001aca2264a
#00000001aca225c0: c00400000024	brcl	0,00000001aca22608
>00000001aca225c6: e390a0080024	stg	%r9,8(%r10)
00000001aca225cc: e310b0180004	lg	%r1,24(%r11)
00000001aca225d2: e310a0000024	stg	%r1,0(%r10)
00000001aca225d8: e3a0b0180024	stg	%r10,24(%r11)
00000001aca225de: ebffb028007a	agsi	40(%r11),-1
[ 10.653039] Call Trace:
[ 10.653042] [<00000001aca225c6>] dma_pool_free+0x4e/0xe8
[ 10.653049] [<000003ff7fcd02e8>] nvme_unmap_data+0x98/0x168 [nvme]
[ 10.653054] [<000003ff7fcd043c>] nvme_pci_complete_batch+0x84/0x100 [nvme]
[ 10.653058] [<000003ff7fcd0b34>] nvme_irq+0x64/0x80 [nvme]
[ 10.653061] [<00000001ac80b1c6>] __handle_irq_event_percpu+0x5e/0x1b8
[ 10.653065] [<00000001ac80b346>] handle_irq_event_percpu+0x26/0x70
[ 10.653068] [<00000001ac8112f0>] handle_percpu_irq+0x68/0x90
[ 10.653072] [<00000001ac809c8e>] generic_handle_irq+0x3e/0x60
[ 10.653074] [<00000001ac77f794>] zpci_floating_irq_handler+0xdc/0x190
[ 10.653079] [<00000001ad2f212e>] do_airq_interrupt+0x8e/0xf0
[ 10.653085] [<00000001ac80b1c6>] __handle_irq_event_percpu+0x5e/0x1b8
[ 10.653087] [<00000001ac80b346>] handle_irq_event_percpu+0x26/0x70
[ 10.653090] [<00000001ac8112f0>] handle_percpu_irq+0x68/0x90
[ 10.653092] [<00000001ac809c8e>] generic_handle_irq+0x3e/0x60
[ 10.653094] [<00000001ac7451d8>] do_irq_async+0x50/0xb0
[ 10.653100] [<00000001ad3b5dca>] do_io_irq+0xba/0x168
[ 10.653103] [<00000001ad3c5736>] io_int_handler+0xd6/0x110
[ 10.653107] [<00000001ad3c57b6>] psw_idle_exit+0x0/0xa
[ 10.653109] ([<00000001ad3b6aac>] default_idle_call+0x3c/0x108)
[ 10.653112] [<00000001ac7ea43c>] do_idle+0xd4/0x168
[ 10.653115] [<00000001ac7ea696>] cpu_startup_entry+0x36/0x40
[ 10.653118] [<00000001ac750ac4>] smp_start_secondary+0x12c/0x138
[ 10.653121] [<00000001ad3c5a5e>] restart_int_handler+0x6e/0x90
[ 10.653123] Last Breaking-Event-Address:
[ 10.653124] [<000003ff7fccd2a4>] 0x3ff7fccd2a4
[ 10.653130] Kernel panic - not syncing: Fatal exception in interrupt