kernel BUG at nvme/host/pci.c

Keith Busch keith.busch at intel.com
Fri Jul 14 10:08:47 PDT 2017


On Fri, Jul 14, 2017 at 06:47:43PM +0200, Andreas Pflug wrote:
> Am 13.07.17 um 15:47 schrieb Keith Busch:
> >
> >> Jul 13 10:37:37 xen2 [  202.703231] ---[ end trace 5b778353298dbe78 ]---
> >> Jul 13 10:37:37 xen2 [  202.704217] sg[0] phys_addr:0x0000000aff50ec00 offset:3072 length:9216 dma_address:0x000000000070f000 dma_length:9216
> >> Jul 13 10:37:37 xen2 [  202.705197] sg[1] phys_addr:0x0000000aff511000 offset:0 length:4096 dma_address:0x00000008755a1000 dma_length:4096
> >> Jul 13 10:37:37 xen2 [  202.706275] sg[2] phys_addr:0x0000000aff5ef000 offset:0 length:8192 dma_address:0x0000000000712000 dma_length:8192
> >> Jul 13 10:37:37 xen2 [  202.707315] sg[3] phys_addr:0x0000000aff564000 offset:0 length:4096 dma_address:0x0000000874fc0000 dma_length:4096
> >> Jul 13 10:37:37 xen2 [  202.708202] sg[4] phys_addr:0x0000000aff5a7000 offset:0 length:4096 dma_address:0x0000000874fc0000 dma_length:4096
> >> Jul 13 10:37:37 xen2 [  202.709030] sg[5] phys_addr:0x0000000aff5a6000 offset:0 length:4096 dma_address:0x0000000874fc0000 dma_length:4096
> >> Jul 13 10:37:37 xen2 [  202.709960] sg[6] phys_addr:0x0000000aff5a5000 offset:0 length:3072 dma_address:0x0000000874fc0000 dma_length:3072
> >> Jul 13 10:37:37 xen2 [  202.710755] print_req_error: I/O error, dev nvme0n1, sector 1188548943
> >> Jul 13 10:37:37 xen2 [  202.711527] md/raid1:md1: nvme0n1p1: rescheduling sector 1188284751
> > The first SGL has phys addr aff50ec00, which is a page offset of 3072,
> > but the dma addr is 70f000, which is a 0 offset. Since DMA page offset
> > doesn't match the physical address', this isn't compatible with the
> > nvme implementation.
>
> So LVM2 backed by md raid1 isn't compatible with newer hardware... Any
> suggestions?

It's not that LVM2 or RAID isn't compatible. Either the IOMMU isn't
compatible, if it can use different page offsets for DMA addresses than
the physical addresses, or the driver for it is broken. The DMA addresses
in this mapped SGL look completely broken, at least, since the last 4
entries are all the same address. That'll corrupt data.
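
For anyone who wants to reproduce the two observations from the dump,
here is a rough sketch. It is not kernel code. It assumes 4 KiB pages,
and the helper names (offset_mismatches, duplicate_dma) are made up for
illustration. The addresses and lengths are copied from the sg[] lines
quoted above.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages

# (phys_addr, dma_address, length) copied from the sg[0]..sg[6] log lines
SG_ENTRIES = [
    (0x0000000AFF50EC00, 0x000000000070F000, 9216),
    (0x0000000AFF511000, 0x00000008755A1000, 4096),
    (0x0000000AFF5EF000, 0x0000000000712000, 8192),
    (0x0000000AFF564000, 0x0000000874FC0000, 4096),
    (0x0000000AFF5A7000, 0x0000000874FC0000, 4096),
    (0x0000000AFF5A6000, 0x0000000874FC0000, 4096),
    (0x0000000AFF5A5000, 0x0000000874FC0000, 3072),
]


def offset_mismatches(entries, page_size=PAGE_SIZE):
    """Indices whose DMA page offset differs from the physical page offset."""
    return [
        i
        for i, (phys, dma, _length) in enumerate(entries)
        if phys % page_size != dma % page_size
    ]


def duplicate_dma(entries):
    """Map each DMA address used by more than one entry to those indices."""
    seen = {}
    for i, (_phys, dma, _length) in enumerate(entries):
        seen.setdefault(dma, []).append(i)
    return {dma: idx for dma, idx in seen.items() if len(idx) > 1}


if __name__ == "__main__":
    print("offset mismatches:", offset_mismatches(SG_ENTRIES))
    for dma, idx in duplicate_dma(SG_ENTRIES).items():
        print(f"dma_address {dma:#x} shared by sg{idx}")
```

With the values from this report, the first check flags only sg[0]
(physical offset 3072, DMA offset 0), and the second shows sg[3]
through sg[6] all mapped to the same DMA address.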


