[PATCH v3 1/8] hwtracing: Add trace function support for HiSilicon PCIe Tune and Trace device

Yicong Yang yangyicong at huawei.com
Mon Feb 14 04:51:06 PST 2022


On 2022/2/8 19:07, Yicong Yang wrote:
> On 2022/2/7 19:42, Jonathan Cameron wrote:
>> On Mon, 24 Jan 2022 21:11:11 +0800
>> Yicong Yang <yangyicong at hisilicon.com> wrote:
>>
>>> HiSilicon PCIe tune and trace device(PTT) is a PCIe Root Complex
>>> integrated Endpoint(RCiEP) device, providing the capability
>>> to dynamically monitor and tune the PCIe traffic, and trace
>>> the TLP headers.
>>>
>>> Add the driver for the device to enable the trace function.
>>> This patch adds basic function of trace, including the device's
>>> probe and initialization, functions for trace buffer allocation
>>> and trace enable/disable, register an interrupt handler to
>>> simply response to the DMA events. The user interface of trace
>>> will be added in the following patch.
>>>
>>> Signed-off-by: Yicong Yang <yangyicong at hisilicon.com>
>> Hi Yicong,
>>
>> I've not been following all the earlier discussion on this driver closely
>> so I may well raise something that has already been addressed. If so
>> just ignore the comment.
> 
> Thanks for the comments. It's ok for me to clarify it :).
> Part replies inline and I need to do some test on the others.
> 
>>
>> Thanks,
>>
>> Jonathan
>>
>>> ---
>>>  drivers/Makefile                 |   1 +
>>>  drivers/hwtracing/Kconfig        |   2 +
>>>  drivers/hwtracing/ptt/Kconfig    |  11 +
>>>  drivers/hwtracing/ptt/Makefile   |   2 +
>>>  drivers/hwtracing/ptt/hisi_ptt.c | 398 +++++++++++++++++++++++++++++++
>>>  drivers/hwtracing/ptt/hisi_ptt.h | 159 ++++++++++++
>>>  6 files changed, 573 insertions(+)
>>>  create mode 100644 drivers/hwtracing/ptt/Kconfig
>>>  create mode 100644 drivers/hwtracing/ptt/Makefile
>>>  create mode 100644 drivers/hwtracing/ptt/hisi_ptt.c
>>>  create mode 100644 drivers/hwtracing/ptt/hisi_ptt.h
>>>
> [...]
>>> +
>>> +static int hisi_ptt_alloc_trace_buf(struct hisi_ptt *hisi_ptt)
>>> +{
>>> +	struct hisi_ptt_trace_ctrl *ctrl = &hisi_ptt->trace_ctrl;
>>> +	struct device *dev = &hisi_ptt->pdev->dev;
>>> +	struct hisi_ptt_dma_buffer *buffer;
>>> +	int i, ret;
>>> +
>>> +	hisi_ptt->trace_ctrl.buf_index = 0;
>>> +
>>> +	/* Make sure the trace buffer is empty before allocating */
>>
>> This comment is misleading as it suggests it not being empty is
>> a bad thing but the code handles it as an acceptable path.
>> Perhaps:
>> 	/*
>> 	 * If the trace buffer has already been allocated, zero the
>> 	 * memory.
>> 	 */
>>
> 
> will make it less misleading. thanks.
> 
>>> +	if (!list_empty(&ctrl->trace_buf)) {
>>> +		list_for_each_entry(buffer, &ctrl->trace_buf, list)
>>> +			memset(buffer->addr, 0, buffer->size);
>>> +		return 0;
>>> +	}
>>> +
>>> +	for (i = 0; i < HISI_PTT_TRACE_BUF_CNT; ++i) {
>>> +		buffer = kzalloc(sizeof(*buffer), GFP_KERNEL);
>>> +		if (!buffer) {
>>> +			ret = -ENOMEM;
>>> +			goto err;
>>> +		}
>>> +
>>> +		buffer->addr = dma_alloc_coherent(dev, ctrl->buffer_size,
>>> +						  &buffer->dma, GFP_KERNEL);
>>> +		if (!buffer->addr) {
>>> +			kfree(buffer);
>>> +			ret = -ENOMEM;
>>> +			goto err;
>>> +		}
>>> +
>>> +		memset(buffer->addr, 0, buffer->size);
>> See:
>> https://lore.kernel.org/lkml/20190108130701.14161-4-hch@lst.de/
>> dma_alloc_coherent() always zeros the memory for us hence there
>> is no longer a dma_kzalloc_coherent()
>>
> 
> thanks for the information. Then the memset here is redundant and will drop it.
> 
>>> +
>>> +		buffer->index = i;
>>
>> Carrying an index inside a list which corresponds directly
>> to the position in the list is not particularly nice.
>> Why can't we compute this index on the fly where the list
>> is walked?  Or am I misunderstanding and the order of the buffers
>> is changed in a later patch?
>>
> 
> The index is fixed once allocated and I stored it to avoid later
> computing. But seems it's highly recommended to compute these sort
> of things on the fly when necessary. John recommends the same things
> on some other places so I think I can get these addressed.
> 
>> As a side note, is a list actually appropriate when we always
>> have 4 of these buffers?  Feels like an array of buffer
>> structures might be cheaper.
>>

As suggested here and below, I tried to maintianed the buffers with
an array instead of a list and it looks more straightforward and some
fields of buffer structure can also be dropped. So I think I can change
to use an array.

Thanks for the suggestion!

Yicong

>>> +		buffer->size = ctrl->buffer_size;
>>> +		list_add_tail(&buffer->list, &ctrl->trace_buf);
>>> +	}
>>> +
>>> +	return 0;
>>> +err:
>>> +	hisi_ptt_free_trace_buf(hisi_ptt);
>>> +	return ret;
>>> +}
>>> +
>>> +static void hisi_ptt_trace_end(struct hisi_ptt *hisi_ptt)
>>> +{
>>> +	writel(0, hisi_ptt->iobase + HISI_PTT_TRACE_CTRL);
>>> +	hisi_ptt->trace_ctrl.status = HISI_PTT_TRACE_STATUS_OFF;
>>> +}
>>> +
>>> +static int hisi_ptt_trace_start(struct hisi_ptt *hisi_ptt)
>>> +{
>>> +	struct hisi_ptt_trace_ctrl *ctrl = &hisi_ptt->trace_ctrl;
>>> +	struct hisi_ptt_dma_buffer *cur;
>>> +	u32 val;
>>> +
>>> +	/* Check device idle before start trace */
>>> +	if (hisi_ptt_wait_trace_hw_idle(hisi_ptt)) {
>>> +		pci_err(hisi_ptt->pdev, "Failed to start trace, the device is still busy.\n");
>>> +		return -EBUSY;
>>> +	}
>>> +
>>> +	/* Reset the DMA before start tracing */
>>> +	val = readl(hisi_ptt->iobase + HISI_PTT_TRACE_CTRL);
>>> +	val |= HISI_PTT_TRACE_CTRL_RST;
>>> +	writel(val, hisi_ptt->iobase + HISI_PTT_TRACE_CTRL);
>>> +
>>> +	/*
>>> +	 * We'll be in the perf context where preemption is disabled,
>>> +	 * so use busy loop here.
>>> +	 */
>>> +	mdelay(HISI_PTT_RESET_WAIT_MS);
>>
>> Busy look for 1 second?  Ouch.  If we can reduce this in any way
>> that would be great or if there is a means to do it before
>> we disable preemption.
>>
> 
> It's inherited from the previous version that was using msleep() and it's
> somehow unacceptable in an atomic context I think. The reset here is
> going to reset the write pointer of the hardware DMA so we can check the
> whether the pointer before dereset it. I confirmed with our hardware
> teams that it can be reduced to 10us. So I'll poll the write pointer register
> for about 10us before continue here.
> 
> thanks for catching this!
> 
>>> +
>>> +	val = readl(hisi_ptt->iobase + HISI_PTT_TRACE_CTRL);
>>> +	val &= ~HISI_PTT_TRACE_CTRL_RST;
>>> +	writel(val, hisi_ptt->iobase + HISI_PTT_TRACE_CTRL);
>>> +
>>> +	/* Clear the interrupt status */
>>> +	writel(HISI_PTT_TRACE_INT_STAT_MASK, hisi_ptt->iobase + HISI_PTT_TRACE_INT_STAT);
>>> +	writel(0, hisi_ptt->iobase + HISI_PTT_TRACE_INT_MASK);
>>> +
>>> +	/* Configure the trace DMA buffer */
>>> +	list_for_each_entry(cur, &ctrl->trace_buf, list) {
>>
>> I comment on the use of cur->index above.  Here it would be easy to compute
>> the index as we go for example assuming we never end up with holes
>> in the list.
>>
> 
> ok.
> 
>>> +		writel(lower_32_bits(cur->dma),
>>> +		       hisi_ptt->iobase + HISI_PTT_TRACE_ADDR_BASE_LO_0 +
>>> +		       cur->index * HISI_PTT_TRACE_ADDR_STRIDE);
>>> +		writel(upper_32_bits(cur->dma),
>>> +		       hisi_ptt->iobase + HISI_PTT_TRACE_ADDR_BASE_HI_0 +
>>> +		       cur->index * HISI_PTT_TRACE_ADDR_STRIDE);
>>> +	}
>>> +	writel(ctrl->buffer_size, hisi_ptt->iobase + HISI_PTT_TRACE_ADDR_SIZE);
>>> +
>>> +	/* Set the trace control register */
>>> +	val = FIELD_PREP(HISI_PTT_TRACE_CTRL_TYPE_SEL, ctrl->type);
>>> +	val |= FIELD_PREP(HISI_PTT_TRACE_CTRL_RXTX_SEL, ctrl->direction);
>>> +	val |= FIELD_PREP(HISI_PTT_TRACE_CTRL_DATA_FORMAT, ctrl->format);
>>> +	val |= FIELD_PREP(HISI_PTT_TRACE_CTRL_TARGET_SEL, hisi_ptt->trace_ctrl.filter);
>>> +	if (!hisi_ptt->trace_ctrl.is_port)
>>> +		val |= HISI_PTT_TRACE_CTRL_FILTER_MODE;
>>> +
>>> +	/* Start the Trace */
>>> +	val |= HISI_PTT_TRACE_CTRL_EN;
>>> +	writel(val, hisi_ptt->iobase + HISI_PTT_TRACE_CTRL);
>>> +
>>> +	ctrl->status = HISI_PTT_TRACE_STATUS_ON;
>>> +
>>> +	return 0;
>>> +}
>>> +
>>
>> ...
>>
>>> +
>>> +static void hisi_ptt_init_ctrls(struct hisi_ptt *hisi_ptt)
>>> +{
>>> +	struct pci_dev *pdev = hisi_ptt->pdev;
>>> +	struct pci_bus *bus;
>>> +	u32 reg;
>>> +
>>> +	INIT_LIST_HEAD(&hisi_ptt->port_filters);
>>> +	INIT_LIST_HEAD(&hisi_ptt->req_filters);
>>> +
>>> +	/*
>>> +	 * The device range register provides the information about the
>>> +	 * root ports which the RCiEP can control and trace. The RCiEP
>>> +	 * and the root ports it support are on the same PCIe core, with
>>> +	 * same domain number but maybe different bus number. The device
>>> +	 * range register will tell us which root ports we can support,
>>> +	 * Bit[31:16] indicates the upper BDF numbers of the root port,
>>> +	 * while Bit[15:0] indicates the lower.
>>> +	 */
>>> +	reg = readl(hisi_ptt->iobase + HISI_PTT_DEVICE_RANGE);
>>> +	hisi_ptt->upper = reg >> 16;
>>> +	hisi_ptt->lower = reg & 0xffff;
>> Trivial:
>> Perhaps worthing define HISI_PTT_DEVICE_RANGE_UPPER_MASK etc adn using
>> FIELD_GET?
>>
> 
> sure.
> 
>>> +
>>> +	reg = readl(hisi_ptt->iobase + HISI_PTT_LOCATION);
>>> +	hisi_ptt->core_id = FIELD_GET(HISI_PTT_CORE_ID, reg);
>>> +	hisi_ptt->sicl_id = FIELD_GET(HISI_PTT_SICL_ID, reg);
>>> +
>>> +	bus = pci_find_bus(pci_domain_nr(pdev->bus), PCI_BUS_NUM(hisi_ptt->upper));
>>> +	if (bus)
>>> +		pci_walk_bus(bus, hisi_ptt_init_filters, hisi_ptt);
>>> +
>>> +	/* Initialize trace controls */
>>> +	INIT_LIST_HEAD(&hisi_ptt->trace_ctrl.trace_buf);
>>> +	hisi_ptt->trace_ctrl.buffer_size = HISI_PTT_TRACE_BUF_SIZE;
>>> +	hisi_ptt->trace_ctrl.default_cpu = cpumask_first(cpumask_of_node(dev_to_node(&pdev->dev)));
>>> +}
>>> +
> [...]
>>> +
>>> +#define HISI_PCIE_CORE_PORT_ID(devfn)	(PCI_FUNC(devfn) << 1)
>>> +
>>> +enum hisi_ptt_trace_status {
>>> +	HISI_PTT_TRACE_STATUS_OFF = 0,
>>> +	HISI_PTT_TRACE_STATUS_ON,
>>> +};
>>
>> Why not just use a boolean given we only have off and on states?
>>
> 
> An enum may make the code more readable I think.
> 
> Thanks,
> Yicong
> 
> .
> 



More information about the linux-arm-kernel mailing list