[PATCH v7 3/5] Documentation: Add documentation for the APM X-Gene SoC EDAC DTS binding

Rob Herring robherring2 at gmail.com
Thu Apr 30 15:36:16 PDT 2015


On Thu, Apr 30, 2015 at 4:19 PM, Loc Ho <lho at apm.com> wrote:
> Hi,
>
>>> I had read all the emails interaction. Yes I can write a single EDAC
>>> driver. But I actually have multiple instances and single instance of
>>> the same IP's. For example, I have 4 DDR controllers, 4 CPU domains,
>>> one L3 and one SoC. If you can suggest to me how this would work with
>>> devicetree, I might be able to make it works. But from HW point of
>>> view (take the memory controller as an example), there is really 4
>>> hardware blocks and some shared IP's block for status. It is much
>>> simple representation with multiple instance of the driver.
>>
>> Your whole error reporting is done roughly through the same registers,
>> as Arnd pointed out. Why can't you build your single driver around that
>> and based on the error type being reported, multiplex inside a *single*
>> interrupt handler to the proper decoding routines.
>
> Yes... I can do this but I don't see how it will scale when some SoC
> in the future may have more or less instance of the memory controllers
> for example. I guess for now, I will just iteration 4 time with an
> single base address or four resources in a loop. Let's agreed on the
> DT layout and I will generate an new patch then. See below...

You are right, it does not scale.

>>> I think we should first agreed on the DT binding and let's not worry
>>> about APEI. Then, whether we have one file or multiple file. Again,
>>> the HW consist of:
>>>
>>> 1. One top level interrupt and status registers

For these registers, are there ECC specific functions here or just
normal interrupt control/status bits (mask/unmask/status)? Assuming
the later, then you should make this block an interrupt-controller.
Then this is the interrupt-parent for the rest of the blocks.

>>> 2. CSW, MCB A and MCB B for detection of any active memory controllers
>>> 3. 4 individual memory controller instance
>>> 4. 4 CPU domain instance with its own individual L2 registers
>>> 5. Each CPU has instance own L1
>>> 6. One L3 instance
>>> 7. One SoC instance
>>
>> And all of those are separate IP blocks and they can be integrated with
>> other, non APM hardware, or do those things all belong together?
>
> The top level interrupt may be different and APM specific unless other
> vendors adapt the same bit definitions. I highly doubt other vendor
> will use the same bit definitions. The CSW is APM only. The MCB A, MCB
> B, and memory controller are APM only. The L3, and SoC are APM specify
> only. For L1 and L2, I will need to check with the CPU designer - but
> likely APM specific.
>
>> IOW,
>> can we expect such IP blocks to appear in other hw platforms or are
>> they all custom APM design which need to go together due to proprietary
>> coherency protocol or whatever, for example?
>
> See above.
>
>>
>>> As for as what is useful, we find that the memory controller and the
>>> L1 and L2 are useful as it provides info for errors - for HW/system
>>> debugging as well as margin of DDR.
>>
>> Sounds like all is useful.
>>
>
> Now let us discuss the DT layout if it is an single driver:
>
> xgeneedac: xgeneedac at 7e800000 {
>         compatible = "apm,xgene-edac";
>         regmap-efuse = <&efuse>; /* efuse */
>         reg = <0x0 0x78800000 0x0 0x100>, /* Top level interrupt
> status resource */
>                 <0x0 0x7e200000 0x0 0x1000>, /* CSW for MCB active resource
>                 <0x0 0x7e700000 0x0 0x1000>, /* MCB A resource */
>                 <0x0 0x7e720000 0x0 0x1000>, /* MCB B resource */
>                 <0x0 0x7e800000 0x0 0x1000>, /* MCU 0 resource */
>                 <0x0 0x7e840000 0x0 0x1000>, /* MCU 1 resource */
>                 <0x0 0x7e880000 0x0 0x1000>, /* MCU 2 resource */
>                 <0x0 0x7e8c0000 0x0 0x1000>, /* MCU 3 resource */
>                 <0x0 0x7c000000 0x0 0x200000>, /* CPU 0 domain for L1 and L2 */
>                 <0x0 0x7c200000 0x0 0x200000>, /* CPU 1 domain for L1 and L2 */
>                 <0x0 0x7c400000 0x0 0x200000>, /* CPU 2 domain for L1 and L2 */
>                 <0x0 0x7c600000 0x0 0x200000>, /* CPU 3 domain for L1 and L2 */
>                 <0x0 0x7e600000 0x0 0x1000>, /* L3 resource */
>                 <0x0 0x7e930000 0x0 0x1000>, /* SoC bus resource */
>                 <0x0 0x7e000000 0x0 0x1000>, /* SoC device resource */

Uggg, no. You were on the right track earlier in the thread. One node
per instance of each block.

>       interrupts = <0x0 0x20 0x4>,
>                        <0x0 0x21 0x4>,
>                        <0x0 0x27 0x4>;

Can you describe what each of these interrupts do?

Rob



More information about the linux-arm-kernel mailing list