[PATCH v6 0/9] Fix kdump faults on system with amd iommu

Baoquan He bhe at redhat.com
Thu Nov 3 22:29:13 PDT 2016


On 11/04/16 at 01:14pm, Baoquan He wrote:
> Hi Joerg,
> 
> Ping!
> 
> About the v6 post, do you have any suggestions?
> 
> Because of the GCR3 special handling in patch 9/9, I spent several days
> studying the background and changing the code. Then, when I tried to post,
> the virtual interrupt remapping feature caused a kernel hang with this
> patchset applied, so it took me more days to study the spec and track it
> down. As a result, the post came out very late.
> 
> Could it be possible that we review and merge patches 1/9~8/9, and leave
> patch 9/9, which includes the GCR3 special handling, as a second step? Then
> I can backport patches 1/9~8/9 to our distro. This bug has been
> discussed for a long time, and currently almost all systems are deployed
> with amd iommu v1 hardware. It would be great if they can be accepted
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~ Here I meant that in our Red Hat lab almost all
systems are deployed with only amd iommu v1 support.

> into 4.9 or 4.10-rc phase.
> 
> As for patch 9/9, its code is a little complicated and has not been
> reviewed yet, and I am not sure that I understand your suggestion and the
> GCR3 code well. What's your opinion?
> 
> Thanks
> Baoquan
> 
> 
> On 10/20/16 at 07:37pm, Baoquan He wrote:
> > This is the v6 post.
> > 
> > The principle of the fix is similar to the one for intel iommu: defer the
> > assignment of a device to a domain until device driver init. There is a
> > difference from intel iommu, though. AMD iommu creates the protection
> > domain and assigns the device to the domain at iommu driver init stage. So
> > in this patchset I still allow the assignment of device to domain at the
> > software level, but defer writing the domain info, especially the pte_root,
> > into the dev table entry until device driver init stage.
> > 
> > v5: 
> >     The bnx2 NIC can't reset itself during driver init. Posted a patch to
> >     reset it during driver init. IO_PAGE_FAULT is no longer seen.
> >
> >     Below is the link to the v5 post.
> >     https://lists.linuxfoundation.org/pipermail/iommu/2016-September/018527.html
> > 
> > v5->v6:
> >     Following Joerg's comments, made the main changes below:
> >     - Add a sanity check when copying the old dev tables.
> > 
> >     - Discard the old patch 6/8.
> > 
> >     - If a device is set up with guest translations (DTE.GV=1), then don't
> >       copy that information but move the device over to an empty guest-cr3
> >       table and handle the faults in the PPR log (which just answers them
> >       with INVALID).
> > 
> > Issues that need to be discussed:
> >     - Joerg suggested hooking the code that updates the domain info in
> >       the dte entry into the set_dma_mask call-back. I tried, but on my local
> >       machine with amd iommu v2, an ohci pci device doesn't call set_dma_mask,
> >       so IO_PAGE_FAULT messages flooded the log.
> > 
> >       00:12.0 USB controller: Advanced Micro Devices, Inc. [AMD] FCH USB OHCI Controller (rev 11)
> > 
> >     - About the GCR3 root pointer copying issue, I don't know how to set up
> >       the test environment and haven't tested it yet. I hope Joerg or Zongshun
> >       can tell me what steps should be taken to test it, or help run a test in
> >       your test environment.
> >  
> > Baoquan He (9):
> >   iommu/amd: Detect pre enabled translation
> >   iommu/amd: add several helper function
> >   iommu/amd: Define bit fields for DTE particularly
> >   iommu/amd: Add function copy_dev_tables
> >   iommu/amd: copy old trans table from old kernel
> >   iommu/amd: Don't update domain info to dte entry at iommu init stage
> >   iommu/amd: Update domain into to dte entry during device driver init
> >   iommu/amd: Add sanity check of irq remap information of old dev table
> >     entry
> >   iommu/amd: Don't copy GCR3 table root pointer
> > 
> >  drivers/iommu/amd_iommu.c       |  93 +++++++++++++-------
> >  drivers/iommu/amd_iommu_init.c  | 189 +++++++++++++++++++++++++++++++++++++---
> >  drivers/iommu/amd_iommu_proto.h |   2 +
> >  drivers/iommu/amd_iommu_types.h |  53 ++++++++++-
> >  drivers/iommu/amd_iommu_v2.c    |  18 +++-
> >  5 files changed, 307 insertions(+), 48 deletions(-)
> > 
> > -- 
> > 2.5.5
> > 
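
For readers following the cover letter above, the deferral it describes can
be modelled roughly as follows. This is only an illustrative userspace
sketch under simplifying assumptions: struct dte, struct domain,
struct dev_state and the three functions are hypothetical names made up
for this example, not the code in the patchset or in drivers/iommu. The
idea shown is that the old kernel's dev table entries stay live after
iommu init, the new domain is recorded only in software, and the entry is
rewritten when the device driver initializes.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_DEVS 4

/* Hypothetical, simplified stand-ins; NOT the real amd_iommu structures. */
struct dte { uint64_t pte_root; };      /* entry the hardware would read */
struct domain { uint64_t pte_root; };   /* page table root of the new kernel */
struct dev_state {
	struct domain *domain;          /* software-level assignment only */
	bool deferred;                  /* true until the driver initializes */
};

static struct dte dev_table[NUM_DEVS];
static struct dev_state dev_states[NUM_DEVS];

/* kdump kernel iommu init: keep the old kernel's translations in place. */
void copy_old_dev_table(const struct dte *old)
{
	for (size_t i = 0; i < NUM_DEVS; i++) {
		dev_table[i] = old[i];
		dev_states[i].domain = NULL;
		dev_states[i].deferred = true;
	}
}

/* iommu init stage: record the domain, but leave a deferred entry alone. */
void attach_device(size_t devid, struct domain *dom)
{
	dev_states[devid].domain = dom;
	if (!dev_states[devid].deferred)
		dev_table[devid].pte_root = dom->pte_root;
}

/* device driver init stage: now write the new domain info into the entry. */
void driver_init_device(size_t devid)
{
	struct dev_state *st = &dev_states[devid];

	if (st->deferred && st->domain) {
		dev_table[devid].pte_root = st->domain->pte_root;
		st->deferred = false;
	}
}
```

In this model, in-flight DMA from the old kernel keeps hitting the copied
entry until driver_init_device() runs for that device, which is the window
in which the real hardware would otherwise raise IO_PAGE_FAULT.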
