[PATCH] intel-iommu: Synchronize gcmd value with global command register
Takao Indoh
indou.takao at jp.fujitsu.com
Mon Apr 8 04:57:07 EDT 2013
(2013/04/04 23:24), David Woodhouse wrote:
> On Thu, 2013-04-04 at 14:48 +0900, Takao Indoh wrote:
>>
>> - DMAR fault messages flood the log and the second kernel does not boot.
>> I recently saw a similar report: https://lkml.org/lkml/2013/3/8/120
>
> Right. So the fix for that is to make the subsequent errors silent,
> until/unless we actually get a request to create a mapping for the given
> device.
>
>> - The igb driver detects an error on link-up and kdump over the network fails.
>
> That's a driver bug, IIRC. It was failing to completely reset the
> hardware. It's fixed now, isn't it?
No, it can still be reproduced with the latest kernel (3.9.0-rc6).
>
>> - On a certain platform, although kdump itself works, a PCIe error such as
>> Unexpected Completion is detected and the hardware is marked as degraded.
>
> More information required.
When I tested intel_iommu on a certain machine, the following error
message was logged by its firmware, and the I/O board went into an
abnormal state. 05:00.0 is igb, so I think this was caused by a DMA error
on igb. It occurs before the igb driver is loaded, so it cannot be fixed
in the driver.

  PCI: Unexpected Completion Bus: 5 Device: 0x00 Function: 0x00

Anyway, I think we should introduce some framework that quiesces all
devices and stops DMA at boot time, rather than dealing with the problem
in each driver. One way I found is to reset devices in the PCIe layer.
If DMAR is disabled in init_dmars(), we get a chance to make devices
stop DMA in the PCI layer, like qci-quirk. This is one of the reasons
why I propose this patch.
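
As a rough illustration of why the cached gcmd value should match the
hardware before DMAR is disabled, here is a minimal user-space sketch. It
is not the patch itself. struct fake_iommu, sync_gcmd() and
disable_translation() are hypothetical names, and the "hardware" is a toy
model in which the status register simply mirrors the last command
written. Only the TE/QIE/IRE bit positions are taken from the VT-d
register layout.

```c
#include <stdint.h>

/* Bit positions in the VT-d Global Command / Global Status registers. */
#define GCMD_TE   (1u << 31)  /* translation enable */
#define GCMD_QIE  (1u << 26)  /* queued invalidation enable */
#define GCMD_IRE  (1u << 25)  /* interrupt remapping enable */
#define GSTS_TES  (1u << 31)
#define GSTS_QIES (1u << 26)
#define GSTS_IRES (1u << 25)

/* Hypothetical stand-in for one IOMMU unit (assumption, not kernel code). */
struct fake_iommu {
    uint32_t gsts;  /* models the hardware status register */
    uint32_t gcmd;  /* software's cached copy of the command register */
};

/* Rebuild the cached command value from what the hardware reports. */
static void sync_gcmd(struct fake_iommu *iommu)
{
    uint32_t sts = iommu->gsts;

    iommu->gcmd = 0;
    if (sts & GSTS_TES)
        iommu->gcmd |= GCMD_TE;
    if (sts & GSTS_QIES)
        iommu->gcmd |= GCMD_QIE;
    if (sts & GSTS_IRES)
        iommu->gcmd |= GCMD_IRE;
}

/* Clear TE in the cached value and "write" it to the toy hardware. */
static void disable_translation(struct fake_iommu *iommu)
{
    iommu->gcmd &= ~GCMD_TE;
    /* Toy model: status mirrors the command bits that were written. */
    iommu->gsts = iommu->gcmd & (GCMD_TE | GCMD_QIE | GCMD_IRE);
}
```

In this model, if the first kernel left TE, QIE and IRE set and the second
kernel starts with gcmd == 0, calling disable_translation() directly also
clears QIE and IRE. Calling sync_gcmd() first clears only TE.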
Thanks,
Takao Indoh