[EXTERNAL] Re: [PATCH v7 5/7] coresight: tmc: Add support for reading crash data

Mon Apr 15 02:28:36 PDT 2024

On 15/04/2024 05:01, Linu Cherian wrote:
> Hi James,
> 
>> -----Original Message-----
>> From: James Clark <james.clark at arm.com>
>> Sent: Friday, April 12, 2024 3:36 PM
>> To: Linu Cherian <lcherian at marvell.com>; Suzuki K Poulose
>> <suzuki.poulose at arm.com>
>> Cc: linux-arm-kernel at lists.infradead.org; coresight at lists.linaro.org; linux-
>> kernel at vger.kernel.org; robh+dt at kernel.org;
>> krzysztof.kozlowski+dt at linaro.org; conor+dt at kernel.org;
>> devicetree at vger.kernel.org; Sunil Kovvuri Goutham
>> <sgoutham at marvell.com>; George Cherian <gcherian at marvell.com>; Anil
>> Kumar Reddy H <areddy3 at marvell.com>; Tanmay Jagdale
>> <tanmay at marvell.com>; mike.leach at linaro.org; leo.yan at linaro.org
>> Subject: [EXTERNAL] Re: [PATCH v7 5/7] coresight: tmc: Add support for
>> reading crash data
>>
>> Prioritize security for external emails: Confirm sender and content safety
>> before clicking links or opening attachments
>>
>> ----------------------------------------------------------------------
>>
>>
>> On 07/03/2024 03:36, Linu Cherian wrote:
>>> * Introduce a new mode CS_MODE_READ_CRASHDATA for reading trace
>>>   captured in previous crash/watchdog reset.
>>>
>>> * Add special device files for reading ETR/ETF crash data.
>>>
>>> * User can read the crash data as below
>>>
>>>   For example, for reading crash data from tmc_etf sink
>>>
>>>   #dd if=/dev/crash_tmc_etfXX of=~/cstrace.bin
>>>
>>> Signed-off-by: Anil Kumar Reddy <areddy3 at marvell.com>
>>> Signed-off-by: Tanmay Jagdale <tanmay at marvell.com>
>>> Signed-off-by: Linu Cherian <lcherian at marvell.com>
>>> ---
>>> Changelog from v6:
>>> * Removed read_prevboot flag in sysfs
>>> * Added special device files for reading crashdata
>>> * Renamed CS mode READ_PREVBOOT to READ_CRASHDATA
>>> * Setting the READ_CRASHDATA mode is done as part of file open.
>>>
>>
>> [...]
>>
>>> @@ -619,6 +740,19 @@ static int tmc_probe(struct amba_device *adev,
>> const struct amba_id *id)
>>>  		coresight_unregister(drvdata->csdev);
>>>  	else
>>>  		pm_runtime_put(&adev->dev);
>>> +
>>> +	if (!is_tmc_reserved_region_valid(dev))
>>> +		goto out;
>>> +
>>> +	drvdata->crashdev.name =
>>> +		devm_kasprintf(dev, GFP_KERNEL, "%s_%s", "crash",
>> desc.name);
>>> +	drvdata->crashdev.minor = MISC_DYNAMIC_MINOR;
>>> +	drvdata->crashdev.fops = &tmc_crashdata_fops;
>>> +	ret = misc_register(&drvdata->crashdev);
>>> +	if (ret)
>>> +		pr_err("%s: Failed to setup dev interface for crashdata\n",
>>> +		       desc.name);
>>> +
>>
>> Is this all optional after the is_tmc_reserved_region_valid()? Skipping to out
>> seems to be more like an error condition, but in this case it's not? Having it
>> like this makes it more difficult to add extra steps to the probe function. You
>> could move it to a function and flip the condition which would be clearer:
>>
> 
> Ack.
> 
>>    if (is_tmc_reserved_region_valid(dev))
>>       register_crash_dev_interface(drvdata);
>>
>>
>>>  out:
>>>  	return ret;
>>>  }
>>
>> [...]
>>
>>>
>>> +static int tmc_etr_setup_crashdata_buf(struct tmc_drvdata *drvdata) {
>>> +	int rc = 0;
>>> +	u64 trace_addr;
>>> +	struct etr_buf *etr_buf;
>>> +	struct etr_flat_buf *resrv_buf;
>>> +	struct tmc_crash_metadata *mdata;
>>> +	struct device *dev = &drvdata->csdev->dev;
>>> +
>>> +	mdata = drvdata->crash_mdata.vaddr;
>>> +
>>> +	etr_buf = kzalloc(sizeof(*etr_buf), GFP_ATOMIC);
>>> +	if (!etr_buf) {
>>> +		rc = -ENOMEM;
>>> +		goto out;
>>> +	}
>>> +	etr_buf->size = drvdata->crash_tbuf.size;
>>> +
>>> +	resrv_buf = kzalloc(sizeof(*resrv_buf), GFP_ATOMIC);
>>> +	if (!resrv_buf) {
>>> +		rc = -ENOMEM;
>>> +		goto rmem_err;
>>> +	}
>>> +
>>> +	/*
>>> +	 * Buffer address given by metadata for retrieval of trace data
>>> +	 * from previous boot is expected to be same as the reserved
>>> +	 * trace buffer memory region provided through DTS
>>> +	 */
>>> +	if (is_tmc_reserved_region_valid(dev->parent)
>>> +	    && (drvdata->crash_tbuf.paddr == mdata->trc_paddr))
>>> +		resrv_buf->vaddr = drvdata->crash_tbuf.vaddr;
>>> +	else {
>>> +		dev_dbg(dev, "Trace buffer address of previous boot
>> invalid\n");
>>> +		rc = -EINVAL;
>>> +		goto map_err;
>>> +	}
>>> +
>>> +	resrv_buf->size = etr_buf->size;
>>> +	resrv_buf->dev = &drvdata->csdev->dev;
>>> +	etr_buf->hwaddr = trace_addr;
>>
>> trace_addr is uninitialised at this point. If you pull the latest coresight branch
>> we enabled some extra compiler warnings which catch this.
>>
>> I assume it's supposed to be mdata->trc_paddr?
> 
> 
> Since no DMA operation happens here, its supposed to be kept NULL.
> Since it was redundant for crash data read operation, missed catching this. Will fix this.
> 
>>
>> Is there some kind of test that could be added that could have caught this?
>> Maybe skip the full reboot flow but just enable the feature and see if we can
>> read from the buffer.
> 
> As pointed above this field is redundant for crashdata read mode. So its not a functionality breakage as such.
> 

Ah ok that's not as bad as I thought then. I went back to version 4 or 5
and thought I saw it was assigned so assumed there was a mistake.

>>
>> Or even just toggling the mode on and off via sysfs. We're running the Perf
>> and kselftests on Juno internally so I can add a reserved region to the Juno DT
>> and make sure this stays working.
> 
> Did you meant adding a kselftest for tracing and reading back tracedata in sysfs mode using the reserved region ?
> 

Yes just enable, disable then read back. I don't think we need to do a
decode, but make sure a non zero size is read. And the test can be
skipped if the reserved region doesn't exist.

> Thanks
> Linu Cherian. 
>