[PATCH 1/3] coresight etm4x: Early clear TRCOSLAR.OSLK prior to TRCIDR1 read

Suzuki K Poulose suzuki.poulose at arm.com
Mon Jan 23 14:49:00 PST 2023


On 23/01/2023 19:47, Steve Clevenger wrote:
> 
> 
> On 1/23/2023 2:54 AM, Mike Leach wrote:
>> Hi Steve,
>>
>> On Sat, 21 Jan 2023 at 07:31, Steve Clevenger
>> <scclevenger at os.amperecomputing.com> wrote:
>>>
>>>
>>> Hi Mike,
>>>
>>> Comments in-line.
>>>
>>> Steve
>>>
>>> On 1/20/2023 3:45 AM, Mike Leach wrote:
>>>> Hi Steve,
>>>>
>>>> On Fri, 20 Jan 2023 at 00:52, Steve Clevenger
>>>> <scclevenger at os.amperecomputing.com> wrote:
>>>>>
>>>>> Add Ampere early clear of ETM TRCOSLAR.OSLK prior to TRCIDR1 access.
>>>>> Ampere Computing erratum AC03_DEBUG_06 describes an Ampere
>>>>> Computing design decision MMIO reads are considered the same as an
>>>>> external debug access. If TRCOSLAR.OSLK is set, the TRCIDR1 access
>>>>> results in a bus fault followed by a kernel panic.  A TRCIDR1 read
>>>>> is valid regardless of TRCOSLAR.OSLK provided MMIO access
>>>>> (now deprecated) is supported.
>>>>> AC03_DEBUG_06 is described in the AmpereOne Developer Errata:
>>>>> https://solutions.amperecomputing.com/customer-connect/products/AmpereOne-device-documentation
>>>>>
>>>>> Add Ampere ETM PID required for Coresight ETM driver support.
>>>>>
>>>>> Signed-off-by: Steve Clevenger <scclevenger at os.amperecomputing.com>
>>>>> ---
>>>>>   .../coresight/coresight-etm4x-core.c          | 36 +++++++++++++++----
>>>>>   drivers/hwtracing/coresight/coresight-etm4x.h |  2 ++
>>>>>   2 files changed, 32 insertions(+), 6 deletions(-)
>>>>>
>>>>> diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c b/drivers/hwtracing/coresight/coresight-etm4x-core.c
>>>>> index 1cc052979e01..533be1928a09 100644
>>>>> --- a/drivers/hwtracing/coresight/coresight-etm4x-core.c
>>>>> +++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c
>>>>> @@ -1091,19 +1091,34 @@ static void etm4_init_arch_data(void *info)
>>>>>          drvdata = dev_get_drvdata(init_arg->dev);
>>>>>          csa = init_arg->csa;
>>>>>
>>>>
>>>> As far as I can tell there appears to be an initialisation issue here.
>>>> etm_probe()
>>>> ...
>>>> struct csdev_access access = { 0 };
>>>> ...
>>>> init_arg.csa = &access
>>>>
>>>> ::call=> etm4_init_arch_data(init_arg)
>>>>
>>>> Thus csa is uninitialised?
>>> It looks to me csa is intended to be initialized to zero? In any case,
>>> the Ampere check uses only the ETM pid, which is initialized directly above.
>>>
>>
>> Sorry, I should have been more explicit.
>>
>> csa is the addressing abstraction used by all the underlying register
>> read/write code.
>>
>> It is initialised to {0} in the calling code, probably to avoid the
>> kernel tests complaining about uninitialised use of a variable.
>>
>> However in the etm4_init_csdev_access() function we are using a base
>> address then it is initialised to:-
>>
>> struct csdev_access {
>>      io_mem = true;
>>      *base = io_mem_base_addr;
>> };
>>
>> and in the access using system registers for an etm4 to:
>>
>> struct csdev_access {
>>      io_mem = false;
>>      *read = etm4x_sysreg_read()
>>      *write = etm4x_sysreg_write()
>> };
> Yes, csa is initialized indirectly in etm4_init_csdev_access via calls
> to etm4_init_iomem_access and etm4_init_sysreg_access. So, you are
> correct. csa is zero initialized for the call to etm4_detect_os_lock
> (TRCOSLSR read) prior to my patch which clears TRCOSLAR.OSLK. The
> TRCOSLSR read and TRCOSLAR.OSLK clear default to sysreg access. The csa
> initialization problem is an existing bug I didn't see.
>>
>> Thus all underlying register access can use the correct method for the device.
>>
>>>>
>>>>> +       /* Detect the support for OS Lock before we actually use it */
>>>>> +       etm_detect_os_lock(drvdata, csa);
>>>>> +
>>
>> Thus passing a 0 init csa object to the etm_detect_os_lock()  fn above
>> seems to be suspicious.
>>
>>>>> +       /*
>>>>> +        * For ETM implementations that consider MMIO an external access
>>>>> +        * clear TRCOSLAR.OSLK early.
>>>>> +        */
>>>>> +       if (drvdata->mmio_external)
>>>>> +               etm4_os_unlock_csa(drvdata, csa);
>>>>> +
>>>>>          /*
>>>>>           * If we are unable to detect the access mechanism,
>>>>>           * or unable to detect the trace unit type, fail
>>>>> -        * early.
>>>>> +        * early. Reset TRCOSLAR.OSLK if cleared.
>>>>>           */
>>>>> -       if (!etm4_init_csdev_access(drvdata, csa))
>>>>> +       if (!etm4_init_csdev_access(drvdata, csa)) {
>>>>
>>>> This call initialises csa according to sysreg / iomem access requirements
>>
>>> csa is initialized only when no drvdata->base exists.
>>
>> Not so - csa is initialised in both circumstances as described above.
>>
>>> Under what
>>> circumstance would there be no ETM base given the recommended CoreSight
>>> ACPI implementation? See the examples in ARM Document number: DEN0067.
>>
>>
>> This will be used in the ETE devices (which share the etm4 driver), or
>> any ETM4.6+ that uses the "arm,coresight-etm4x-sysreg" device tree
>> binding (not sure what the ACPI equivalent is).
>>
>> So, either way, you need an init csa, before passing it to the driver calls.
> Agreed. See my summary above.
> 
>>
>> Later in the initialisation sequence we generate a coresight_device
>> object which the csa is bound to, and finally if all is well the
>> coresight_device is bound to drvdata at which point the device is
>> ready for use.
>>
>> It is unfortunate, but to handle the two methods of register access,
>> the initilialisation process for the driver has become more
>> complicated with ordering dependencies - to ensure that the rest of
>> the driver remains simpler when accessing device registers.
>>
>> As Suzuki mentioned - moving this specific lock requirement into the
>> _init function would be clearer and ensure that the initialisation
>> sequences were observed.
> Getting back to the etm4_init_csdev_access implementation, if a
> drvdata->base exists, etm4_init_sysreg_access is never called. The
> driver doesn't initialize sysreg access if a memory map exists by
> design. The existing Coresight ACPI specification only has the option of
> describing the manufacturer trace implementation using descriptions of
> memory mapped components.

As mentioned in my previous comments, this is only a matter of
updating the spec. You should be able to use everything same for
the ETM, excluding the "MMIO" region.

i.e,

      Device(CPU0) {
         Name(_HID, "ACPI0010"),
         ...
         Device(ETM0) {
           Name (_HID, "ARMHC500")      // ETM
           Name (_CID, "ARMHC500")      // ETM

           /* No _CRS Memory Resource  for sysreg access */
	  /* Graph Connections as usual, if any ATB is connected */
	  Name (_DSD , Package () {
	  ...
	  })
           ...
         } // ETM0
     } // CPU0


Suzuki

> 
>>
>> Regards
>>
>> Mike
>>
>>>>
>>>>
>>>>
>>>>> +               if (drvdata->mmio_external)
>>>>> +                       etm4_os_lock(drvdata);
>>>>>                  return;
>>>>> +       }
>>>>>
>>>>> -       /* Detect the support for OS Lock before we actually use it */
>>>>> -       etm_detect_os_lock(drvdata, csa);
>>>>> +       /*
>>>>> +        * Make sure all registers are accessible
>>>>> +        * TRCOSLAR.OSLK may already be clear
>>>>> +        */
>>>>> +       if (!drvdata->mmio_external)
>>>>> +               etm4_os_unlock_csa(drvdata, csa);
>>>>>
>>>>> -       /* Make sure all registers are accessible */
>>>>> -       etm4_os_unlock_csa(drvdata, csa);
>>>>>          etm4_cs_unlock(drvdata, csa);
>>>>>
>>>>>          etm4_check_arch_features(drvdata, init_arg->pid);
>>>>> @@ -2027,6 +2042,14 @@ static int etm4_probe(struct device *dev, void __iomem *base, u32 etm_pid)
>>>>>          init_arg.csa = &access;
>>>>>          init_arg.pid = etm_pid;
>>>>>
>>>>> +       /*
>>>>> +        * Ampere ETM v4.6 considers MMIO access as external. This mask
>>>>> +        * isolates the manufacturer JEP106 ID in the PID.
>>>>> +        * TRCPIDR2 (JEDC|DES_1) << 16 | TRCPIDR1 (DES_0) << 8)
>>>>> +        */
>>>>> +       if ((init_arg.pid & 0x000FF000) == 0x00096000)
>>>>> +               drvdata->mmio_external = true;
>>>>> +
>>>>>          /*
>>>>>           * Serialize against CPUHP callbacks to avoid race condition
>>>>>           * between the smp call and saving the delayed probe.
>>>>> @@ -2192,6 +2215,7 @@ static const struct amba_id etm4_ids[] = {
>>>>>          CS_AMBA_ID(0x000bb95e),                 /* Cortex-A57 */
>>>>>          CS_AMBA_ID(0x000bb95a),                 /* Cortex-A72 */
>>>>>          CS_AMBA_ID(0x000bb959),                 /* Cortex-A73 */
>>>>> +       CS_AMBA_UCI_ID(0x00096000, uci_id_etm4),/* Ampere ARMv8 */
>>>>>          CS_AMBA_UCI_ID(0x000bb9da, uci_id_etm4),/* Cortex-A35 */
>>>>>          CS_AMBA_UCI_ID(0x000bbd05, uci_id_etm4),/* Cortex-A55 */
>>>>>          CS_AMBA_UCI_ID(0x000bbd0a, uci_id_etm4),/* Cortex-A75 */
>>>>> diff --git a/drivers/hwtracing/coresight/coresight-etm4x.h b/drivers/hwtracing/coresight/coresight-etm4x.h
>>>>> index 4b21bb79f168..cf4f9f2e1807 100644
>>>>> --- a/drivers/hwtracing/coresight/coresight-etm4x.h
>>>>> +++ b/drivers/hwtracing/coresight/coresight-etm4x.h
>>>>> @@ -1015,6 +1015,7 @@ struct etmv4_save_state {
>>>>>    * @skip_power_up: Indicates if an implementation can skip powering up
>>>>>    *                the trace unit.
>>>>>    * @arch_features: Bitmap of arch features of etmv4 devices.
>>>>> + * @mmio_external: True if ETM considers MMIO an external access.
>>>>>    */
>>>>>   struct etmv4_drvdata {
>>>>>          void __iomem                    *base;
>>>>> @@ -1067,6 +1068,7 @@ struct etmv4_drvdata {
>>>>>          bool                            state_needs_restore;
>>>>>          bool                            skip_power_up;
>>>>>          DECLARE_BITMAP(arch_features, ETM4_IMPDEF_FEATURE_MAX);
>>>>
>>>> Rather than continue to add bools - is it not worthwhile adding to the
>>>> bitmap above and extending the arch features API to allow a
>>>> "has_feature" call?
>>> I can look into this. I agree using a bool for every exception doesn't
>>> scale well. Referring to one Suzuki Poulose review comment, his proposal
>>> to clear TRCOSLAR.OSLK early for all parts would mean one of these bools
>>> could go away. Otherwise, possibly add one (or more) bit definitions for
>>> use by the etm4_disable_arch_specific call. The order of this call would
>>> need to change, depending.
>>>
>>>>
>>>>> +       bool                            mmio_external;
>>>>>   };
>>>>>
>>>>>   /* Address comparator access types */
>>>>> --
>>>>> 2.25.1
>>>>>
>>>> Regards
>>>>
>>>> Mike
>>
>>
>>




More information about the linux-arm-kernel mailing list