[PATCH] PCI: ACPI: Fix ThunderX PEM initialization

David Daney ddaney at caviumnetworks.com
Tue Mar 21 07:56:05 PDT 2017


On 03/21/2017 07:17 AM, Tomasz Nowicki wrote:
> Hi Bjorn,
>
> On 21.03.2017 14:47, Bjorn Helgaas wrote:
>> On Tue, Mar 21, 2017 at 07:38:07AM -0400, Jon Masters wrote:
>>> On 03/16/2017 12:25 PM, David Daney wrote:
>>>> On 03/16/2017 07:32 AM, Jon Masters wrote:
>>>
>>>>>> Yes, it is now contains "CAVxxx" as _HID for device config object.
>>>>>
>>>>> Which is different from the version that was merged into upstream.
>>>>> That
>>>>> should never have happened. It will never happen again. I have
>>>>> spent some
>>>>> time over the past few days ensuring folks understand that I will not
>>>>> allow a repeat of this to occur the next time around. We will have
>>>>> platforms that are bulletproof and supported by upstream with any
>>>>> errata fixes in a very carefully controlled manner. There will
>>>>> under no circumstances ever be a situation like this again.
>>>>
>>>> We are still evaluating the merits of registering the values that
>>>> appeared
>>>> in v4.10, and not changing them.  We should know more in a couple of
>>>> days.
>>>
>>> Thanks David. What was the verdict? (for the public record). If we
>>> need to
>>> get a change into upstream, let's get that teed up before 4.12 merge.
>>>
>>> And for other folks following along with this thread: I'm not just
>>> picking
>>> on Cavium here. I'll be doing the same with *every* ARM server SoC
>>> company
>>> as necessary over the coming months. We are going to have militantly
>>> compliant standards adherence in this industry and every ARM server
>>> SoC is
>>> going to "just work" with an upstream Linux kernel with an ACPI enabled
>>> platform. This will be so utterly clean and boring it'll be amazing.
>>
>> Thanks for keeping on top of this, Jon.  I agree, we should not be
>> using unregistered vendor prefixes, e.g., the "THRX" added by
>> 44f22bd91e88 ("PCI: Add MCFG quirks for Cavium ThunderX pass2.x host
>> controller").  I'm sorry I merged that without doing the due
>> diligence.
>
> Honestly, it is me who is responsible for this since I submitted the patch.

Yes.  After all this back and forth, Cavium has decided to deploy 
firmware with "CAVxxx" as _HID.

The deciding factor was that the prefix is already registered and there 
are probably fewer than 10 systems deployed with the experimental and 
erroneous  "THRXxxx" value.  Neither option (switching the kernel to 
"CAVxxx", or changing the firmware to use "THRXxxx") was without its 
drawbacks.  There were valid arguments on either side.  In the end 
internal momentum, among other factors, brought us to the conclusion 
that using the "CAVxxx" as _HID, and trying to get Tomasz' patches 
merged is what we will do. We fully realize that there may exists some 
combinations of the Linux kernel and Cavium firmware (both officially 
released, and experimental and unsupported) that don't result in 
functional PCIe.

I, personally, am sorry that this screw up happened in the first place. 
Let's hope that everyone learned something from the experience.

Thanks,
David Daney


>
>>
>> I suspect the resolution will be to register "THRX".  If that doesn't
>> happen, I'll propose reverting 44f22bd91e88, not because I want to
>> break things, but only because I'm not personally in a position to do
>> anything smarter.  So please propose a better solution that fits
>> within the ACPI _HID/_CID model :)
>
> I already submitted the patch to fix this. Please see:
> https://patchwork.ozlabs.org/patch/739042/
>
> Thanks,
> Tomasz




More information about the linux-arm-kernel mailing list