[PATCH v9 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel
Don Dutile
ddutile at redhat.com
Tue Apr 7 07:12:43 PDT 2015
On 04/06/2015 11:46 PM, Dave Young wrote:
> On 04/05/15 at 09:54am, Baoquan He wrote:
>> On 04/03/15 at 05:21pm, Dave Young wrote:
>>> On 04/03/15 at 05:01pm, Li, ZhenHua wrote:
>>>> Hi Dave,
>>>>
>>>> It is possible that the old iommu data gets corrupted by some other
>>>> module. Currently we do not have a better solution for the dmar
>>>> faults.
>>>>
>>>> But I think when this happens, we need to fix the module that corrupted
>>>> the old iommu data. I once hit a similar problem in the normal kernel:
>>>> the queue used by the qi_* functions was overwritten by another module.
>>>> The fix was in that module, not in the iommu module.
>>>
>>> That is too late; there will be no chance to save the vmcore then.
>>>
>>> Also, if using the old iommu tables makes it possible to keep corrupting
>>> other areas of oldmem, it will cause more problems.
>>>
>>> So I think the tables at least need some verification before being used.
>>>
>>
>> Yes, this is worth thinking about, and verification is also an
>> interesting idea. kexec/kdump does a sha256 calculation on the loaded
>> kernel and then verifies it again in purgatory when a panic happens. This
>> checks whether any code stomped into the region reserved for kexec/kernel
>> and corrupted the loaded kernel.
>>
>> If we decide to do this, it should be an enhancement to the current
>> patchset, not a change of approach. Since this patchset is getting very
>> close to what the maintainers expected, maybe it can be merged first, and
>> then we can think about enhancements. After all, without this patchset
>> vt-d often raises error messages and hangs.
>
> That does not convince me; we should do it right from the beginning
> instead of introducing something wrong.
>
> I wonder why the old dma cannot be remapped to a specific page in the kdump
> kernel so that it will not corrupt more memory. But I may have missed
> something; I will look for the old threads and catch up.
>
> Thanks
> Dave
>
The issue is not (only) corruption: once the iommu is re-configured in the
kexec kernel (even to a single/simple paging scheme), the old, not-yet-stopped
dma engines will use iovas that generate dmar faults, and those faults will be
enabled by that re-configuration.
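
For reference, the check mentioned earlier in the thread (a sha256 taken when
the kernel is loaded and verified again in purgatory) has roughly the shape
below. This is only a userspace Python sketch of the record-then-verify idea,
not kernel or purgatory code. The region names and the helpers
record_digests() and corrupted_regions() are made up for illustration, and
whether iommu tables can be checked this way is an assumption, not something
the patchset does.

```python
import hashlib
from typing import Dict, List


def digest(region: bytes) -> str:
    """Return the sha256 hex digest of one memory region."""
    return hashlib.sha256(region).hexdigest()


def record_digests(regions: Dict[str, bytes]) -> Dict[str, str]:
    """Load time: remember a digest for every region we care about."""
    return {name: digest(data) for name, data in regions.items()}


def corrupted_regions(regions: Dict[str, bytes],
                      recorded: Dict[str, str]) -> List[str]:
    """Crash time: list the regions whose contents no longer match."""
    return sorted(
        name for name, data in regions.items()
        if digest(data) != recorded.get(name)
    )


# Hypothetical regions standing in for reserved memory (names are invented).
regions = {"root_table": bytes(4096), "context_table": bytes(4096)}
recorded = record_digests(regions)

# Simulate a stray write into one region after the digests were taken.
regions["context_table"] = b"\x01" + bytes(4095)

for name in corrupted_regions(regions, recorded):
    print("digest mismatch:", name)
```

A mismatch only says that a region changed after its digest was recorded; it
does not say who changed it or whether the new contents are safe to use.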