[PATCH v10 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel

Dave Young dyoung at redhat.com
Fri Apr 24 01:49:57 PDT 2015


On 04/24/15 at 04:35pm, Baoquan He wrote:
> On 04/24/15 at 04:25pm, Dave Young wrote:
> > Hi, Baoquan
> > 
> > > I support this patchset.
> > > 
> > > We should not fear oldmem, since the reserved crashkernel region is similar.
> > > No one can guarantee that crazy code won't step into the crashkernel
> > > region just because the 1st kernel says it's reserved for the kdump kernel.
> > > Likewise, the root table and context tables are not built to let legal code
> > > damage them. Both risk being corrupted; to do our best to get a dumped
> > > vmcore, the risk is worth taking.
> > 
> > Old mem is mapped in the 1st kernel, so compared with the reserved
> > crashkernel region it is more likely to be corrupted. They are totally different.
> 
> Could you tell how and why they are different? Would wrong code choose
> the root tables and context tables to damage when it has totally lost
> control?

The iommu maps io addresses to system ram, right? Not to reserved ram. But
yes, I'm assuming the page tables are right, and I was worried that they get
corrupted while the kernel panic is happening.

> 
> > 
> > > 
> > > And the pci reset approach has been NACKed by David Woodhouse, the
> > > maintainer of intel iommu, because the place that calls the pci reset
> > > code is ugly, whether before the kdump kernel or in the kdump kernel. And as
> > > he said, if a certain device made mistakes, why blame all devices? We
> > > should fix the device that made the mistakes.
> > 
> > Resetting the pci bus is no uglier than fixing a problem with a risky
> > approach and then having to fix the problems it introduces in the future.
> 
> There's a problem, we fix the problem. If that's uglier, I need to redefine
> 'ugly' in my personal dict. You mean the problem it could introduce is
> that wrong code will damage the root table and context tables; why don't we
> fix that wrong code instead of blaming innocent context tables? So you mean
> these tables deserve to be damaged by wrong code?

I'd be more than happy to see this issue fixed in the patchset, but I do not
agree to adding the code with such problems. OTOH, for now it seems there's
no way to fix it.

> 
> > 
> > I know it is late to speak out, but sorry, I still object and have to NACK
> > this oldmem approach from my point of view.
> > 
> > Thanks
> > Dave


