[PATCH 1/3] coresight etm4x: Early clear TRCOSLAR.OSLK prior to TRCIDR1 read
Steve Clevenger
scclevenger at os.amperecomputing.com
Mon Jan 23 11:47:51 PST 2023
On 1/23/2023 2:54 AM, Mike Leach wrote:
> Hi Steve,
>
> On Sat, 21 Jan 2023 at 07:31, Steve Clevenger
> <scclevenger at os.amperecomputing.com> wrote:
>>
>>
>> Hi Mike,
>>
>> Comments in-line.
>>
>> Steve
>>
>> On 1/20/2023 3:45 AM, Mike Leach wrote:
>>> Hi Steve,
>>>
>>> On Fri, 20 Jan 2023 at 00:52, Steve Clevenger
>>> <scclevenger at os.amperecomputing.com> wrote:
>>>>
>>>> Add Ampere early clear of ETM TRCOSLAR.OSLK prior to TRCIDR1 access.
>>>> Ampere Computing erratum AC03_DEBUG_06 describes an Ampere
>>>> Computing design decision MMIO reads are considered the same as an
>>>> external debug access. If TRCOSLAR.OSLK is set, the TRCIDR1 access
>>>> results in a bus fault followed by a kernel panic. A TRCIDR1 read
>>>> is valid regardless of TRCOSLAR.OSLK provided MMIO access
>>>> (now deprecated) is supported.
>>>> AC03_DEBUG_06 is described in the AmpereOne Developer Errata:
>>>> https://solutions.amperecomputing.com/customer-connect/products/AmpereOne-device-documentation
>>>>
>>>> Add Ampere ETM PID required for Coresight ETM driver support.
>>>>
>>>> Signed-off-by: Steve Clevenger <scclevenger at os.amperecomputing.com>
>>>> ---
>>>> .../coresight/coresight-etm4x-core.c | 36 +++++++++++++++----
>>>> drivers/hwtracing/coresight/coresight-etm4x.h | 2 ++
>>>> 2 files changed, 32 insertions(+), 6 deletions(-)
>>>>
>>>> diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c b/drivers/hwtracing/coresight/coresight-etm4x-core.c
>>>> index 1cc052979e01..533be1928a09 100644
>>>> --- a/drivers/hwtracing/coresight/coresight-etm4x-core.c
>>>> +++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c
>>>> @@ -1091,19 +1091,34 @@ static void etm4_init_arch_data(void *info)
>>>> drvdata = dev_get_drvdata(init_arg->dev);
>>>> csa = init_arg->csa;
>>>>
>>>
>>> As far as I can tell there appears to be an initialisation issue here.
>>> etm_probe()
>>> ...
>>> struct csdev_access access = { 0 };
>>> ...
>>> init_arg.csa = &access
>>>
>>> ::call=> etm4_init_arch_data(init_arg)
>>>
>>> Thus csa is uninitialised?
>> It looks to me csa is intended to be initialized to zero? In any case,
>> the Ampere check uses only the ETM pid, which is initialized directly above.
>>
>
> Sorry, I should have been more explicit.
>
> csa is the addressing abstraction used by all the underlying register
> read/write code.
>
> It is initialised to {0} in the calling code, probably to avoid the
> kernel tests complaining about uninitialised use of a variable.
>
> However in the etm4_init_csdev_access() function we are using a base
> address then it is initialised to:-
>
> struct csdev_access {
> io_mem = true;
> *base = io_mem_base_addr;
> };
>
> and in the access using system registers for an etm4 to:
>
> struct csdev_access {
> io_mem = false;
> *read = etm4x_sysreg_read()
> *write = etm4x_sysreg_write()
> };
Yes, csa is initialized indirectly in etm4_init_csdev_access via calls
to etm4_init_iomem_access and etm4_init_sysreg_access. So, you are
correct. csa is zero initialized for the call to etm4_detect_os_lock
(TRCOSLSR read) prior to my patch which clears TRCOSLAR.OSLK. The
TRCOSLSR read and TRCOSLAR.OSLK clear default to sysreg access. The csa
initialization problem is an existing bug I didn't see.
>
> Thus all underlying register access can use the correct method for the device.
>
>>>
>>>> + /* Detect the support for OS Lock before we actually use it */
>>>> + etm_detect_os_lock(drvdata, csa);
>>>> +
>
> Thus passing a 0 init csa object to the etm_detect_os_lock() fn above
> seems to be suspicious.
>
>>>> + /*
>>>> + * For ETM implementations that consider MMIO an external access
>>>> + * clear TRCOSLAR.OSLK early.
>>>> + */
>>>> + if (drvdata->mmio_external)
>>>> + etm4_os_unlock_csa(drvdata, csa);
>>>> +
>>>> /*
>>>> * If we are unable to detect the access mechanism,
>>>> * or unable to detect the trace unit type, fail
>>>> - * early.
>>>> + * early. Reset TRCOSLAR.OSLK if cleared.
>>>> */
>>>> - if (!etm4_init_csdev_access(drvdata, csa))
>>>> + if (!etm4_init_csdev_access(drvdata, csa)) {
>>>
>>> This call initialises csa according to sysreg / iomem access requirements
>
>> csa is initialized only when no drvdata->base exists.
>
> Not so - csa is initialised in both circumstances as described above.
>
>> Under what
>> circumstance would there be no ETM base given the recommended CoreSight
>> ACPI implementation? See the examples in ARM Document number: DEN0067.
>
>
> This will be used in the ETE devices (which share the etm4 driver), or
> any ETM4.6+ that uses the "arm,coresight-etm4x-sysreg" device tree
> binding (not sure what the ACPI equivalent is).
>
> So, either way, you need an init csa, before passing it to the driver calls.
Agreed. See my summary above.
>
> Later in the initialisation sequence we generate a coresight_device
> object which the csa is bound to, and finally if all is well the
> coresight_device is bound to drvdata at which point the device is
> ready for use.
>
> It is unfortunate, but to handle the two methods of register access,
> the initilialisation process for the driver has become more
> complicated with ordering dependencies - to ensure that the rest of
> the driver remains simpler when accessing device registers.
>
> As Suzuki mentioned - moving this specific lock requirement into the
> _init function would be clearer and ensure that the initialisation
> sequences were observed.
Getting back to the etm4_init_csdev_access implementation, if a
drvdata->base exists, etm4_init_sysreg_access is never called. The
driver doesn't initialize sysreg access if a memory map exists by
design. The existing Coresight ACPI specification only has the option of
describing the manufacturer trace implementation using descriptions of
memory mapped components.
>
> Regards
>
> Mike
>
>>>
>>>
>>>
>>>> + if (drvdata->mmio_external)
>>>> + etm4_os_lock(drvdata);
>>>> return;
>>>> + }
>>>>
>>>> - /* Detect the support for OS Lock before we actually use it */
>>>> - etm_detect_os_lock(drvdata, csa);
>>>> + /*
>>>> + * Make sure all registers are accessible
>>>> + * TRCOSLAR.OSLK may already be clear
>>>> + */
>>>> + if (!drvdata->mmio_external)
>>>> + etm4_os_unlock_csa(drvdata, csa);
>>>>
>>>> - /* Make sure all registers are accessible */
>>>> - etm4_os_unlock_csa(drvdata, csa);
>>>> etm4_cs_unlock(drvdata, csa);
>>>>
>>>> etm4_check_arch_features(drvdata, init_arg->pid);
>>>> @@ -2027,6 +2042,14 @@ static int etm4_probe(struct device *dev, void __iomem *base, u32 etm_pid)
>>>> init_arg.csa = &access;
>>>> init_arg.pid = etm_pid;
>>>>
>>>> + /*
>>>> + * Ampere ETM v4.6 considers MMIO access as external. This mask
>>>> + * isolates the manufacturer JEP106 ID in the PID.
>>>> + * TRCPIDR2 (JEDC|DES_1) << 16 | TRCPIDR1 (DES_0) << 8)
>>>> + */
>>>> + if ((init_arg.pid & 0x000FF000) == 0x00096000)
>>>> + drvdata->mmio_external = true;
>>>> +
>>>> /*
>>>> * Serialize against CPUHP callbacks to avoid race condition
>>>> * between the smp call and saving the delayed probe.
>>>> @@ -2192,6 +2215,7 @@ static const struct amba_id etm4_ids[] = {
>>>> CS_AMBA_ID(0x000bb95e), /* Cortex-A57 */
>>>> CS_AMBA_ID(0x000bb95a), /* Cortex-A72 */
>>>> CS_AMBA_ID(0x000bb959), /* Cortex-A73 */
>>>> + CS_AMBA_UCI_ID(0x00096000, uci_id_etm4),/* Ampere ARMv8 */
>>>> CS_AMBA_UCI_ID(0x000bb9da, uci_id_etm4),/* Cortex-A35 */
>>>> CS_AMBA_UCI_ID(0x000bbd05, uci_id_etm4),/* Cortex-A55 */
>>>> CS_AMBA_UCI_ID(0x000bbd0a, uci_id_etm4),/* Cortex-A75 */
>>>> diff --git a/drivers/hwtracing/coresight/coresight-etm4x.h b/drivers/hwtracing/coresight/coresight-etm4x.h
>>>> index 4b21bb79f168..cf4f9f2e1807 100644
>>>> --- a/drivers/hwtracing/coresight/coresight-etm4x.h
>>>> +++ b/drivers/hwtracing/coresight/coresight-etm4x.h
>>>> @@ -1015,6 +1015,7 @@ struct etmv4_save_state {
>>>> * @skip_power_up: Indicates if an implementation can skip powering up
>>>> * the trace unit.
>>>> * @arch_features: Bitmap of arch features of etmv4 devices.
>>>> + * @mmio_external: True if ETM considers MMIO an external access.
>>>> */
>>>> struct etmv4_drvdata {
>>>> void __iomem *base;
>>>> @@ -1067,6 +1068,7 @@ struct etmv4_drvdata {
>>>> bool state_needs_restore;
>>>> bool skip_power_up;
>>>> DECLARE_BITMAP(arch_features, ETM4_IMPDEF_FEATURE_MAX);
>>>
>>> Rather than continue to add bools - is it not worthwhile adding to the
>>> bitmap above and extending the arch features API to allow a
>>> "has_feature" call?
>> I can look into this. I agree using a bool for every exception doesn't
>> scale well. Referring to one Suzuki Poulose review comment, his proposal
>> to clear TRCOSLAR.OSLK early for all parts would mean one of these bools
>> could go away. Otherwise, possibly add one (or more) bit definitions for
>> use by the etm4_disable_arch_specific call. The order of this call would
>> need to change, depending.
>>
>>>
>>>> + bool mmio_external;
>>>> };
>>>>
>>>> /* Address comparator access types */
>>>> --
>>>> 2.25.1
>>>>
>>> Regards
>>>
>>> Mike
>
>
>
More information about the linux-arm-kernel
mailing list