[PATCH v10 6/6] drivers: remoteproc: Add Xilinx r5 remoteproc driver

Mathieu Poirier mathieu.poirier at linaro.org
Tue Nov 1 15:05:32 PDT 2022


On Tue, Oct 11, 2022 at 02:25:01PM -0700, Tanmay Shah wrote:
> This driver enables r5f dual core Real time Processing Unit subsystem
> available on Xilinx Zynq Ultrascale MPSoC Platform. RPU subsystem
> (cluster) can be configured in different modes e.g. split mode in which
> two r5f cores work independent of each other and lock-step mode in which
> both r5f cores execute same code clock-for-clock and notify if the
> result is different.
> 
> The Xilinx r5 Remoteproc Driver boots the RPU cores via calls to the Xilinx
> Platform Management Unit that handles the R5 configuration, memory access
> and R5 lifecycle management. The interface to this manager is done in this
> driver via zynqmp_pm_* function calls.
> 
> Signed-off-by: Ben Levinsky <ben.levinsky at amd.com>
> Signed-off-by: Tanmay Shah <tanmay.shah at amd.com>
> Reported-by: kernel test robot <lkp at intel.com>
> ---
> 
> Changes in v10:
>   - fix Kconfig unmet dependecy error reported by kernel test robot
>   - fix r5_rproc object mem leak in function zynqmp_r5_add_rproc_core
>   - add explanation of hardcoded TCM nodes
>   - remove redundant ToDo comment
>   - remove redundant check of tcm_bank_count and rmem_count
>   - remove explicit free reserved_mem in zynqmp_r5_get_mem_region_node
>   - fix leaked reference of child_dev during zynqmp_r5_cluster_init
>     Also fix possible crash in exit path release_r5_cores 
>   - do not remove mem-region and tcm carveouts explicitly in case of failure.
>     It will be deleted as part of rproc_del. This also simplifies logic to
>     use rproc_add_carveout
>   - fix documentation all over the driver
> 
> Changes in v9
>   - replace devm_rproc_alloc with rproc_alloc
>   - %s/until/while/r
>   - remove extra line
>   - %s/i > -1/i >=0/r fix this throughout the driver
>   - fix type of tcm_mode from int to enum rpu_tcm_comb
>   - release &child_pdev->dev references in case of failure
>   - remove zynqmp_r5_core_exit()
>   - undefined memory-region property isn't driver failure
>   - remove tcm bank count check from ops
>   - fix tcm bank turn-off sequence
>   - fix parse_fw function documentation
>   - do not use rproc_mem_entry_init on vdev0buffers
>   - declare variabls in reverse xmas tree order throughout    the driver
> 
> Changes in v8:
>   - None
> 
> Changes in v7:
>   - None
> 
> Changes in v6:
>   - None
> 
> Changes in v5:
>   - None
> 
> Changes in v4:
>   - Remove redundant header files
>   - use dev_err_probe() to report errors during probe
>   - Fix missing check on error code returned by zynqmp_r5_add_rproc_core()
>   - Fix memory leaks all over the driver when resource allocation fails for any core
>   - make cluster mode check only at one place
>   - remove redundant initialization of variable
>   - remove redundant use of of_node_put() 
>   - Fix Comment format problem
>   - Assign offset of zynqmp_tcm_banks instead of duplicating it
>   - Add tcm and memory regions rproc carveouts during prepare instead of parse_fw
>   - Remove rproc_mem_entry object from r5_core
>   - Use put_device() and rproc_del() APIs to fix memory leaks
>   - Replace pr_* with dev_*. This was missed in v3, fix now.
>   - Use "GPL" instead of "GPL v2" in MODULE_LICENSE macro. This was suggested by checkpatch script.
> 
> Changes in v3:
>   - Fix checkpatch script indentation warning
>   - Remove unused variable from xilinx remoteproc driver
>   - use C style comments, i.e /*...*/
>   - Remove redundant debug information which can be derived using /proc/device-tree
>   - Fix multilined comment format
>   - s/"final fot TCM"/"final for TCM"
>   - Function devm_kzalloc() does not return an code on error, just NULL.
>     Remove redundant error check for this function throughout the driver.
>   - Fix RPU mode configuration and add documentation accordingly
>   - Get rid of the indentations to match function documentation style with rest of the driver
>   - Fix memory leak by only using r5_rproc->priv and not replace it with new instance
>   - Use 'i' for the outer loop and 'j' for the inner one as per convention
>   - Remove redundant error and NULL checks throughout the driver
>   - Use devm_kcalloc() when more than one element is required
>   - Add memory-regions carveouts during driver probe instead of parse_fw call
>     This removes redundant copy of reserved_mem object in r5_core structure.
>   - Fix memory leak by using of_node_put()
>   - Fix indentation of tcm_mem_map function args
>   - Remove redundant init of variables
>   - Initialize tcm bank size variable for lockstep mode
>   - Replace u32 with phys_addr_t for variable stroing memory bank address
>   - Add documentation of TCM behavior in lockstep mode
>   - Use dev_get_drvdata instead of platform driver API
>   - Remove info level messages
>   - Fix checkpatch.pl warnings
>   - Add documentation for the Xilinx r5f platform to understand driver design
> 
>  drivers/remoteproc/Kconfig              |   13 +
>  drivers/remoteproc/Makefile             |    1 +
>  drivers/remoteproc/xlnx_r5_remoteproc.c | 1066 +++++++++++++++++++++++
>  3 files changed, 1080 insertions(+)
>  create mode 100644 drivers/remoteproc/xlnx_r5_remoteproc.c
> 
> diff --git a/drivers/remoteproc/Kconfig b/drivers/remoteproc/Kconfig
> index 166019786653..a850e9f486dd 100644
> --- a/drivers/remoteproc/Kconfig
> +++ b/drivers/remoteproc/Kconfig
> @@ -352,6 +352,19 @@ config TI_K3_R5_REMOTEPROC
>  	  It's safe to say N here if you're not interested in utilizing
>  	  a slave processor.
>  
> +config XLNX_R5_REMOTEPROC
> +	tristate "Xilinx R5 remoteproc support"
> +	depends on PM && ARCH_ZYNQMP
> +	select ZYNQMP_FIRMWARE
> +	select RPMSG_VIRTIO
> +	select MAILBOX
> +	select ZYNQMP_IPI_MBOX
> +	help
> +	  Say y or m here to support Xilinx R5 remote processors via the remote
> +	  processor framework.
> +
> +	  It's safe to say N if not interested in using RPU r5f cores.
> +
>  endif # REMOTEPROC
>  
>  endmenu
> diff --git a/drivers/remoteproc/Makefile b/drivers/remoteproc/Makefile
> index 5478c7cb9e07..91314a9b43ce 100644
> --- a/drivers/remoteproc/Makefile
> +++ b/drivers/remoteproc/Makefile
> @@ -38,3 +38,4 @@ obj-$(CONFIG_ST_SLIM_REMOTEPROC)	+= st_slim_rproc.o
>  obj-$(CONFIG_STM32_RPROC)		+= stm32_rproc.o
>  obj-$(CONFIG_TI_K3_DSP_REMOTEPROC)	+= ti_k3_dsp_remoteproc.o
>  obj-$(CONFIG_TI_K3_R5_REMOTEPROC)	+= ti_k3_r5_remoteproc.o
> +obj-$(CONFIG_XLNX_R5_REMOTEPROC)	+= xlnx_r5_remoteproc.o
> diff --git a/drivers/remoteproc/xlnx_r5_remoteproc.c b/drivers/remoteproc/xlnx_r5_remoteproc.c
> new file mode 100644
> index 000000000000..dc7d75c1070a
> --- /dev/null
> +++ b/drivers/remoteproc/xlnx_r5_remoteproc.c
> @@ -0,0 +1,1066 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * ZynqMP R5 Remote Processor driver
> + *
> + */
> +
> +#include <dt-bindings/power/xlnx-zynqmp-power.h>
> +#include <linux/dma-mapping.h>
> +#include <linux/firmware/xlnx-zynqmp.h>
> +#include <linux/kernel.h>
> +#include <linux/module.h>
> +#include <linux/of_address.h>
> +#include <linux/of_platform.h>
> +#include <linux/of_reserved_mem.h>
> +#include <linux/platform_device.h>
> +#include <linux/remoteproc.h>
> +#include <linux/slab.h>
> +
> +#include "remoteproc_internal.h"
> +
> +/*
> + * settings for RPU cluster mode which
> + * reflects possible values of xlnx,cluster-mode dt-property
> + */
> +enum zynqmp_r5_cluster_mode {
> +	SPLIT_MODE = 0, /* When cores run as separate processor */
> +	LOCKSTEP_MODE = 1, /* cores execute same code in lockstep,clk-for-clk */
> +	SINGLE_CPU_MODE = 2, /* core0 is held in reset and only core1 runs */
> +};
> +
> +/**
> + * struct mem_bank_data - Memory Bank description
> + *
> + * @addr: Start address of memory bank
> + * @size: Size of Memory bank
> + * @pm_domain_id: Power-domains id of memory bank for firmware to turn on/off
> + * @bank_name: name of the bank for remoteproc framework
> + */
> +struct mem_bank_data {
> +	phys_addr_t addr;
> +	size_t size;
> +	u32 pm_domain_id;
> +	char *bank_name;
> +};
> +
> +/*
> + * Hardcoded TCM bank values. This will be removed once TCM bindings are
> + * accepted for system-dt specifications and upstreamed in linux kernel
> + */
> +static const struct mem_bank_data zynqmp_tcm_banks[] = {
> +	{0xffe00000UL, 0x10000UL, PD_R5_0_ATCM, "atcm0"}, /* TCM 64KB each */
> +	{0xffe20000UL, 0x10000UL, PD_R5_0_BTCM, "btcm0"},
> +	{0xffe90000UL, 0x10000UL, PD_R5_1_ATCM, "atcm1"},
> +	{0xffeb0000UL, 0x10000UL, PD_R5_1_BTCM, "btcm1"},
> +};
> +
> +/**
> + * struct zynqmp_r5_core
> + *
> + * @dev: device of RPU instance
> + * @np: device node of RPU instance
> + * @tcm_bank_count: number TCM banks accessible to this RPU
> + * @tcm_banks: array of each TCM bank data
> + * @rmem_count: Number of reserved mem regions
> + * @rmem: reserved memory region nodes from device tree
> + * @rproc: rproc handle
> + * @pm_domain_id: RPU CPU power domain id
> + */
> +struct zynqmp_r5_core {
> +	struct device *dev;
> +	struct device_node *np;
> +	int tcm_bank_count;
> +	struct mem_bank_data **tcm_banks;
> +	int rmem_count;
> +	struct reserved_mem **rmem;
> +	struct rproc *rproc;
> +	u32 pm_domain_id;
> +};
> +
> +/**
> + * struct zynqmp_r5_cluster
> + *
> + * @dev: r5f subsystem cluster device node
> + * @mode: cluster mode of type zynqmp_r5_cluster_mode
> + * @core_count: number of r5 cores used for this cluster mode
> + * @r5_cores: Array of pointers pointing to r5 core
> + */
> +struct zynqmp_r5_cluster {
> +	struct device *dev;
> +	enum  zynqmp_r5_cluster_mode mode;
> +	int core_count;
> +	struct zynqmp_r5_core **r5_cores;
> +};
> +
> +/*
> + * zynqmp_r5_set_mode()
> + *
> + * set RPU cluster and TCM operation mode
> + *
> + * @r5_core: pointer to zynqmp_r5_core type object
> + * @fw_reg_val: value expected by firmware to configure RPU cluster mode
> + * @tcm_mode: value expected by fw to configure TCM mode (lockstep or split)
> + *
> + * Return: 0 for success and < 0 for failure
> + */
> +static int zynqmp_r5_set_mode(struct zynqmp_r5_core *r5_core,
> +			      enum rpu_oper_mode fw_reg_val,
> +			      enum rpu_tcm_comb tcm_mode)
> +{
> +	int ret;
> +
> +	ret = zynqmp_pm_set_rpu_mode(r5_core->pm_domain_id, fw_reg_val);
> +	if (ret < 0) {
> +		dev_err(r5_core->dev, "failed to set RPU mode\n");
> +		return ret;
> +	}
> +
> +	ret = zynqmp_pm_set_tcm_config(r5_core->pm_domain_id, tcm_mode);
> +	if (ret < 0)
> +		dev_err(r5_core->dev, "failed to configure TCM\n");
> +
> +	return ret;
> +}
> +
> +/*
> + * zynqmp_r5_rproc_start()
> + * @rproc: single R5 core's corresponding rproc instance
> + *
> + * Start R5 Core from designated boot address.
> + *
> + * return 0 on success, otherwise non-zero value on failure
> + */
> +static int zynqmp_r5_rproc_start(struct rproc *rproc)
> +{
> +	struct zynqmp_r5_core *r5_core = rproc->priv;
> +	enum rpu_boot_mem bootmem;
> +	int ret;
> +
> +	/*
> +	 * The exception vector pointers (EVP) refer to the base-address of
> +	 * exception vectors (for reset, IRQ, FIQ, etc). The reset-vector
> +	 * starts at the base-address and subsequent vectors are on 4-byte
> +	 * boundaries.
> +	 *
> +	 * Exception vectors can start either from 0x0000_0000 (LOVEC) or
> +	 * from 0xFFFF_0000 (HIVEC) which is mapped in the OCM (On-Chip Memory)
> +	 *
> +	 * Usually firmware will put Exception vectors at LOVEC.
> +	 *
> +	 * It is not recommend that you change the exception vector.
> +	 * Changing the EVP to HIVEC will result in increased interrupt latency
> +	 * and jitter. Also, if the OCM is secured and the Cortex-R5F processor
> +	 * is non-secured, then the Cortex-R5F processor cannot access the
> +	 * HIVEC exception vectors in the OCM.
> +	 */
> +	bootmem = (rproc->bootaddr >= 0xFFFC0000) ?
> +		   PM_RPU_BOOTMEM_HIVEC : PM_RPU_BOOTMEM_LOVEC;
> +
> +	dev_dbg(r5_core->dev, "RPU boot addr 0x%llx from %s.", rproc->bootaddr,
> +		bootmem == PM_RPU_BOOTMEM_HIVEC ? "OCM" : "TCM");
> +
> +	ret = zynqmp_pm_request_wake(r5_core->pm_domain_id, 1,
> +				     bootmem, ZYNQMP_PM_REQUEST_ACK_NO);
> +	if (ret)
> +		dev_err(r5_core->dev,
> +			"failed to start RPU = 0x%x\n", r5_core->pm_domain_id);
> +	return ret;
> +}
> +
> +/*
> + * zynqmp_r5_rproc_stop()
> + * @rproc: single R5 core's corresponding rproc instance
> + *
> + * Power down  R5 Core.
> + *
> + * return 0 on success, otherwise non-zero value on failure
> + */
> +static int zynqmp_r5_rproc_stop(struct rproc *rproc)
> +{
> +	struct zynqmp_r5_core *r5_core = rproc->priv;
> +	int ret;
> +
> +	ret = zynqmp_pm_force_pwrdwn(r5_core->pm_domain_id,
> +				     ZYNQMP_PM_REQUEST_ACK_BLOCKING);
> +	if (ret)
> +		dev_err(r5_core->dev, "failed to stop remoteproc RPU %d\n", ret);
> +
> +	return ret;
> +}
> +
> +/*
> + * zynqmp_r5_mem_region_map()
> + * @rproc: single R5 core's corresponding rproc instance
> + * @mem: mem descriptor to map reserved memory-regions
> + *
> + * Callback to map va for memory-region's carveout.
> + *
> + * return 0 on success, otherwise non-zero value on failure
> + */
> +static int zynqmp_r5_mem_region_map(struct rproc *rproc,
> +				    struct rproc_mem_entry *mem)
> +{
> +	void __iomem *va;
> +
> +	va = ioremap_wc(mem->dma, mem->len);
> +	if (IS_ERR_OR_NULL(va))
> +		return -ENOMEM;
> +
> +	mem->va = (void *)va;
> +
> +	return 0;
> +}
> +
> +/*
> + * zynqmp_r5_rproc_mem_unmap
> + * @rproc: single R5 core's corresponding rproc instance
> + * @mem: mem entry to unmap
> + *
> + * Unmap memory-region carveout
> + *
> + * return: always returns 0
> + */
> +static int zynqmp_r5_mem_region_unmap(struct rproc *rproc,
> +				      struct rproc_mem_entry *mem)
> +{
> +	iounmap((void __iomem *)mem->va);
> +	return 0;
> +}
> +
> +/*
> + * add_mem_regions_carveout()
> + * @rproc: single R5 core's corresponding rproc instance
> + *
> + * Construct rproc mem carveouts from memory-region property nodes
> + *
> + * return 0 on success, otherwise non-zero value on failure
> + */
> +static int add_mem_regions_carveout(struct rproc *rproc)
> +{
> +	struct rproc_mem_entry *rproc_mem;
> +	struct zynqmp_r5_core *r5_core;
> +	struct reserved_mem *rmem;
> +	int i, num_mem_regions;
> +
> +	r5_core = (struct zynqmp_r5_core *)rproc->priv;
> +	num_mem_regions = r5_core->rmem_count;
> +
> +	for (i = 0; i < num_mem_regions; i++) {
> +		rmem = r5_core->rmem[i];
> +
> +		if (!strncmp(rmem->name, "vdev0buffer", strlen("vdev0buffer"))) {
> +			/* Init reserved memory for vdev buffer */
> +			rproc_mem = rproc_of_resm_mem_entry_init(&rproc->dev, i,
> +								 rmem->size,
> +								 rmem->base,
> +								 rmem->name);
> +		} else {
> +			/* Register associated reserved memory regions */
> +			rproc_mem = rproc_mem_entry_init(&rproc->dev, NULL,
> +							 (dma_addr_t)rmem->base,
> +							 rmem->size, rmem->base,
> +							 zynqmp_r5_mem_region_map,
> +							 zynqmp_r5_mem_region_unmap,
> +							 rmem->name);
> +		}
> +
> +		if (!rproc_mem)
> +			return -ENOMEM;
> +
> +		rproc_add_carveout(rproc, rproc_mem);
> +
> +		dev_dbg(&rproc->dev, "reserved mem carveout %s addr=%llx, size=0x%llx",
> +			rmem->name, rmem->base, rmem->size);
> +	}
> +
> +	return 0;
> +}
> +
> +/*
> + * tcm_mem_unmap()
> + * @rproc: single R5 core's corresponding rproc instance
> + * @mem: tcm mem entry to unmap
> + *
> + * Unmap TCM banks when powering down R5 core.
> + *
> + * return always 0
> + */
> +static int tcm_mem_unmap(struct rproc *rproc, struct rproc_mem_entry *mem)
> +{
> +	iounmap((void __iomem *)mem->va);
> +
> +	return 0;
> +}
> +
> +/*
> + * tcm_mem_map()
> + * @rproc: single R5 core's corresponding rproc instance
> + * @mem: tcm memory entry descriptor
> + *
> + * Given TCM bank entry, this func setup virtual address for TCM bank
> + * remoteproc carveout. It also takes care of va to da address translation
> + *
> + * return 0 on success, otherwise non-zero value on failure
> + */
> +static int tcm_mem_map(struct rproc *rproc,
> +		       struct rproc_mem_entry *mem)
> +{
> +	void __iomem *va;
> +
> +	va = ioremap_wc(mem->dma, mem->len);
> +	if (IS_ERR_OR_NULL(va))
> +		return -ENOMEM;
> +
> +	/* Update memory entry va */
> +	mem->va = (void *)va;
> +
> +	/* clear TCMs */
> +	memset_io(va, 0, mem->len);
> +
> +	/*
> +	 * The R5s expect their TCM banks to be at address 0x0 and 0x2000,
> +	 * while on the Linux side they are at 0xffexxxxx.
> +	 *
> +	 * Zero out the high 12 bits of the address. This will give
> +	 * expected values for TCM Banks 0A and 0B (0x0 and 0x20000).
> +	 */
> +	mem->da &= 0x000fffff;
> +
> +	/*
> +	 * TCM Banks 1A and 1B still have to be translated.
> +	 *
> +	 * Below handle these two banks' absolute addresses (0xffe90000 and
> +	 * 0xffeb0000) and convert to the expected relative addresses
> +	 * (0x0 and 0x20000).
> +	 */
> +	if (mem->da == 0x90000 || mem->da == 0xB0000)
> +		mem->da -= 0x90000;
> +
> +	/* if translated TCM bank address is not valid report error */
> +	if (mem->da != 0x0 && mem->da != 0x20000) {
> +		dev_err(&rproc->dev, "invalid TCM address: %x\n", mem->da);
> +		return -EINVAL;
> +	}
> +	return 0;
> +}
> +
> +/*
> + * add_tcm_carveout_split_mode()
> + * @rproc: single R5 core's corresponding rproc instance
> + *
> + * allocate and add remoteproc carveout for TCM memory in split mode
> + *
> + * return 0 on success, otherwise non-zero value on failure
> + */
> +static int add_tcm_carveout_split_mode(struct rproc *rproc)
> +{
> +	struct rproc_mem_entry *rproc_mem;
> +	struct zynqmp_r5_core *r5_core;
> +	int i, num_banks, ret;
> +	phys_addr_t bank_addr;
> +	struct device *dev;
> +	u32 pm_domain_id;
> +	size_t bank_size;
> +	char *bank_name;
> +
> +	r5_core = (struct zynqmp_r5_core *)rproc->priv;
> +	dev = r5_core->dev;
> +	num_banks = r5_core->tcm_bank_count;
> +
> +	/*
> +	 * Power-on Each 64KB TCM,
> +	 * register its address space, map and unmap functions
> +	 * and add carveouts accordingly
> +	 */
> +	for (i = 0; i < num_banks; i++) {
> +		bank_addr = r5_core->tcm_banks[i]->addr;
> +		bank_name = r5_core->tcm_banks[i]->bank_name;
> +		bank_size = r5_core->tcm_banks[i]->size;
> +		pm_domain_id = r5_core->tcm_banks[i]->pm_domain_id;
> +
> +		ret = zynqmp_pm_request_node(pm_domain_id,
> +					     ZYNQMP_PM_CAPABILITY_ACCESS, 0,
> +					     ZYNQMP_PM_REQUEST_ACK_BLOCKING);
> +		if (ret < 0) {
> +			dev_err(dev, "failed to turn on TCM 0x%x", pm_domain_id);
> +			goto release_tcm_split;
> +		}
> +
> +		dev_dbg(dev, "TCM carveout split mode %s addr=%llx, size=0x%lx",
> +			bank_name, bank_addr, bank_size);
> +
> +		rproc_mem = rproc_mem_entry_init(dev, NULL, bank_addr,
> +						 bank_size, bank_addr,
> +						 tcm_mem_map, tcm_mem_unmap,
> +						 bank_name);
> +		if (!rproc_mem) {
> +			ret = -ENOMEM;
> +			zynqmp_pm_release_node(pm_domain_id);
> +			goto release_tcm_split;
> +		}
> +
> +		rproc_add_carveout(rproc, rproc_mem);
> +	}
> +
> +	return 0;
> +
> +release_tcm_split:
> +	/* If failed, Turn off all TCM banks turned on before */
> +	for (i--; i >= 0; i--) {
> +		pm_domain_id = r5_core->tcm_banks[i]->pm_domain_id;
> +		zynqmp_pm_release_node(pm_domain_id);
> +	}
> +	return ret;
> +}
> +
> +/*
> + * add_tcm_carveout_lockstep_mode()
> + * @rproc: single R5 core's corresponding rproc instance
> + *
> + * allocate and add remoteproc carveout for TCM memory in lockstep mode
> + *
> + * return 0 on success, otherwise non-zero value on failure
> + */
> +static int add_tcm_carveout_lockstep_mode(struct rproc *rproc)
> +{
> +	struct rproc_mem_entry *rproc_mem;
> +	struct zynqmp_r5_core *r5_core;
> +	int i, num_banks, ret;
> +	phys_addr_t bank_addr;
> +	size_t bank_size = 0;
> +	struct device *dev;
> +	u32 pm_domain_id;
> +	char *bank_name;
> +
> +	r5_core = (struct zynqmp_r5_core *)rproc->priv;
> +	dev = r5_core->dev;
> +
> +	/* Go through zynqmp banks for r5 node */
> +	num_banks = r5_core->tcm_bank_count;
> +
> +	/*
> +	 * In lockstep mode, TCM is contiguous memory block
> +	 * However, each TCM block still needs to be enabled individually.
> +	 * So, Enable each TCM block individually, but add their size
> +	 * to create contiguous memory region.
> +	 */
> +	bank_addr = r5_core->tcm_banks[0]->addr;
> +	bank_name = r5_core->tcm_banks[0]->bank_name;
> +
> +	for (i = 0; i < num_banks; i++) {
> +		bank_size += r5_core->tcm_banks[i]->size;
> +		pm_domain_id = r5_core->tcm_banks[i]->pm_domain_id;
> +
> +		/* Turn on each TCM bank individually */
> +		ret = zynqmp_pm_request_node(pm_domain_id,
> +					     ZYNQMP_PM_CAPABILITY_ACCESS, 0,
> +					     ZYNQMP_PM_REQUEST_ACK_BLOCKING);
> +		if (ret < 0) {
> +			dev_err(dev, "failed to turn on TCM 0x%x", pm_domain_id);
> +			goto release_tcm_lockstep;
> +		}
> +	}
> +
> +	dev_dbg(dev, "TCM add carveout lockstep mode %s addr=0x%llx, size=0x%lx",
> +		bank_name, bank_addr, bank_size);
> +
> +	/* Register TCM address range, TCM map and unmap functions */
> +	rproc_mem = rproc_mem_entry_init(dev, NULL, bank_addr,
> +					 bank_size, bank_addr,
> +					 tcm_mem_map, tcm_mem_unmap,
> +					 bank_name);
> +	if (!rproc_mem) {
> +		ret = -ENOMEM;
> +		goto release_tcm_lockstep;
> +	}
> +
> +	/* If registration is success, add carveouts */
> +	rproc_add_carveout(rproc, rproc_mem);
> +
> +	return 0;
> +
> +release_tcm_lockstep:
> +	/* If failed, Turn off all TCM banks turned on before */
> +	for (i--; i >= 0; i--) {
> +		pm_domain_id = r5_core->tcm_banks[i]->pm_domain_id;
> +		zynqmp_pm_release_node(pm_domain_id);
> +	}
> +	return ret;
> +}
> +
> +/*
> + * add_tcm_banks()
> + * @rproc: single R5 core's corresponding rproc instance
> + *
> + * allocate and add remoteproc carveouts for TCM memory based on cluster mode
> + *
> + * return 0 on success, otherwise non-zero value on failure
> + */
> +static int add_tcm_banks(struct rproc *rproc)
> +{
> +	struct zynqmp_r5_cluster *cluster;
> +	struct zynqmp_r5_core *r5_core;
> +	struct device *dev;
> +
> +	r5_core = (struct zynqmp_r5_core *)rproc->priv;
> +	if (!r5_core)
> +		return -EINVAL;
> +
> +	dev = r5_core->dev;
> +
> +	cluster = dev_get_drvdata(dev->parent);
> +	if (!cluster) {
> +		dev_err(dev->parent, "Invalid driver data\n");
> +		return -EINVAL;
> +	}
> +
> +	/*
> +	 * In lockstep mode TCM banks are one contiguous memory region of 256Kb
> +	 * In split mode, each TCM bank is 64Kb and not contiguous.
> +	 * We add memory carveouts accordingly.
> +	 */
> +	if (cluster->mode == SPLIT_MODE)
> +		return add_tcm_carveout_split_mode(rproc);
> +	else if (cluster->mode == LOCKSTEP_MODE)
> +		return add_tcm_carveout_lockstep_mode(rproc);
> +
> +	return -EINVAL;
> +}
> +
> +/*
> + * zynqmp_r5_parse_fw()
> + * @rproc: single R5 core's corresponding rproc instance
> + * @fw: ptr to firmware to be loaded onto r5 core
> + *
> + * get resource table if available
> + *
> + * return 0 on success, otherwise non-zero value on failure
> + */
> +static int zynqmp_r5_parse_fw(struct rproc *rproc, const struct firmware *fw)
> +{
> +	int ret;
> +
> +	ret = rproc_elf_load_rsc_table(rproc, fw);
> +	if (ret == -EINVAL) {
> +		/*
> +		 * resource table only required for IPC.
> +		 * if not present, this is not necessarily an error;
> +		 * for example, loading r5 hello world application
> +		 * so simply inform user and keep going.
> +		 */
> +		dev_info(&rproc->dev, "no resource table found.\n");
> +		ret = 0;
> +	}
> +	return ret;
> +}
> +
> +/**
> + * zynqmp_r5_rproc_prepare()
> + * adds carveouts for TCM bank and reserved memory regions
> + *
> + * @rproc: Device node of each rproc
> + *
> + * Return: 0 for success else < 0 error code
> + */
> +static int zynqmp_r5_rproc_prepare(struct rproc *rproc)
> +{
> +	int ret;
> +
> +	ret = add_tcm_banks(rproc);
> +	if (ret) {
> +		dev_err(&rproc->dev, "failed to get TCM banks, err %d\n", ret);
> +		return ret;
> +	}
> +
> +	ret = add_mem_regions_carveout(rproc);
> +	if (ret) {
> +		dev_err(&rproc->dev, "failed to get reserve mem regions %d\n", ret);
> +		return ret;
> +	}
> +
> +	return 0;
> +}
> +
> +/**
> + * zynqmp_r5_rproc_unprepare()
> + * Turns off TCM banks using power-domain id
> + *
> + * @rproc: Device node of each rproc
> + *
> + * Return: always 0
> + */
> +static int zynqmp_r5_rproc_unprepare(struct rproc *rproc)
> +{
> +	struct zynqmp_r5_core *r5_core;
> +	u32 pm_domain_id;
> +	int i;
> +
> +	r5_core = (struct zynqmp_r5_core *)rproc->priv;
> +
> +	for (i = 0; i < r5_core->tcm_bank_count; i++) {
> +		pm_domain_id = r5_core->tcm_banks[i]->pm_domain_id;
> +		if (zynqmp_pm_release_node(pm_domain_id))
> +			dev_warn(r5_core->dev,
> +				 "can't turn off TCM bank 0x%x", pm_domain_id);
> +	}
> +
> +	return 0;
> +}
> +
> +static const struct rproc_ops zynqmp_r5_rproc_ops = {
> +	.prepare	= zynqmp_r5_rproc_prepare,
> +	.unprepare	= zynqmp_r5_rproc_unprepare,
> +	.start		= zynqmp_r5_rproc_start,
> +	.stop		= zynqmp_r5_rproc_stop,
> +	.load		= rproc_elf_load_segments,
> +	.parse_fw	= zynqmp_r5_parse_fw,
> +	.find_loaded_rsc_table = rproc_elf_find_loaded_rsc_table,
> +	.sanity_check	= rproc_elf_sanity_check,
> +	.get_boot_addr	= rproc_elf_get_boot_addr,
> +};
> +
> +/**
> + * zynqmp_r5_add_rproc_core()
> + * Allocate and add struct rproc object for each r5f core
> + * This is called for each individual r5f core
> + *
> + * @cdev: Device node of each r5 core
> + *
> + * Return: zynqmp_r5_core object for success else error code pointer
> + */
> +static struct zynqmp_r5_core *zynqmp_r5_add_rproc_core(struct device *cdev)
> +{
> +	struct zynqmp_r5_core *r5_core;
> +	struct rproc *r5_rproc;
> +	int ret;
> +
> +	/* Set up DMA mask */
> +	ret = dma_set_coherent_mask(cdev, DMA_BIT_MASK(32));
> +	if (ret)
> +		return ERR_PTR(ret);
> +
> +	/* Allocate remoteproc instance */
> +	r5_rproc = rproc_alloc(cdev, dev_name(cdev),
> +			       &zynqmp_r5_rproc_ops,
> +			       NULL, sizeof(struct zynqmp_r5_core));
> +	if (!r5_rproc) {
> +		dev_err(cdev, "failed to allocate memory for rproc instance\n");
> +		return ERR_PTR(-ENOMEM);
> +	}
> +
> +	r5_rproc->auto_boot = false;
> +	r5_core = (struct zynqmp_r5_core *)r5_rproc->priv;
> +	r5_core->dev = cdev;
> +	r5_core->np = dev_of_node(cdev);
> +	if (!r5_core->np) {
> +		dev_err(cdev, "can't get device node for r5 core\n");
> +		ret = -EINVAL;
> +		goto free_rproc;
> +	}
> +
> +	/* Add R5 remoteproc core */
> +	ret = rproc_add(r5_rproc);
> +	if (ret) {
> +		dev_err(cdev, "failed to add r5 remoteproc\n");
> +		goto free_rproc;
> +	}
> +
> +	r5_core->rproc = r5_rproc;
> +	return r5_core;
> +
> +free_rproc:
> +	rproc_free(r5_rproc);
> +	return ERR_PTR(ret);
> +}
> +
> +/**
> + * zynqmp_r5_get_tcm_node()
> + * Ideally this function should parse tcm node and store information
> + * in r5_core instance. For now, Hardcoded TCM information is used.
> + * This approach is used as TCM bindings for system-dt is being developed
> + *
> + * @cluster: pointer to zynqmp_r5_cluster type object
> + *
> + * Return: 0 for success and < 0 error code for failure.
> + */
> +static int zynqmp_r5_get_tcm_node(struct zynqmp_r5_cluster *cluster)
> +{
> +	struct device *dev = cluster->dev;
> +	struct zynqmp_r5_core *r5_core;
> +	int tcm_bank_count, tcm_node;
> +	int i, j;
> +
> +	tcm_bank_count = ARRAY_SIZE(zynqmp_tcm_banks);
> +
> +	/* count per core tcm banks */
> +	tcm_bank_count = tcm_bank_count / cluster->core_count;
> +
> +	/*
> +	 * r5 core 0 will use all of TCM banks in lockstep mode.
> +	 * In split mode, r5 core0 will use 128k and r5 core1 will use another
> +	 * 128k. Assign TCM banks to each core accordingly
> +	 */
> +	tcm_node = 0;
> +	for (i = 0; i < cluster->core_count; i++) {
> +		r5_core = cluster->r5_cores[i];
> +		r5_core->tcm_banks = devm_kcalloc(dev, tcm_bank_count,
> +						  sizeof(struct mem_bank_data *),
> +						  GFP_KERNEL);
> +		if (!r5_core->tcm_banks)
> +			return -ENOMEM;
> +
> +		for (j = 0; j < tcm_bank_count; j++) {
> +			/*
> +			 * Use pre-defined TCM reg values.
> +			 * Eventually this should be replaced by values
> +			 * parsed from dts.
> +			 */
> +			r5_core->tcm_banks[j] =
> +				(struct mem_bank_data *)&zynqmp_tcm_banks[tcm_node];
> +			tcm_node++;
> +		}
> +
> +		r5_core->tcm_bank_count = tcm_bank_count;
> +	}
> +
> +	return 0;
> +}
> +
> +/**
> + * zynqmp_r5_get_mem_region_node()
> + * parse memory-region property and get reserved mem regions
> + *
> + * @r5_core: pointer to zynqmp_r5_core type object
> + *
> + * Return: 0 for success and error code for failure.
> + */
> +static int zynqmp_r5_get_mem_region_node(struct zynqmp_r5_core *r5_core)
> +{
> +	struct device_node *np, *rmem_np;
> +	struct reserved_mem **rmem;
> +	int res_mem_count, i;
> +	struct device *dev;
> +
> +	dev = r5_core->dev;
> +	np = r5_core->np;
> +
> +	res_mem_count = of_property_count_elems_of_size(np, "memory-region",
> +							sizeof(phandle));
> +	if (res_mem_count <= 0) {
> +		dev_warn(dev, "failed to get memory-region property %d\n",
> +			 res_mem_count);
> +		return 0;
> +	}
> +
> +	rmem = devm_kcalloc(dev, res_mem_count,
> +			    sizeof(struct reserved_mem *), GFP_KERNEL);
> +	if (!rmem)
> +		return -ENOMEM;
> +
> +	for (i = 0; i < res_mem_count; i++) {
> +		rmem_np = of_parse_phandle(np, "memory-region", i);
> +		if (!rmem_np)
> +			goto release_rmem;
> +
> +		rmem[i] = of_reserved_mem_lookup(rmem_np);

Function of_reserved_mem_device_release() is never called for these. 

> +		if (!rmem[i]) {
> +			of_node_put(rmem_np);
> +			goto release_rmem;
> +		}
> +
> +		of_node_put(rmem_np);
> +	}
> +
> +	r5_core->rmem_count = res_mem_count;
> +	r5_core->rmem = rmem;
> +	return 0;
> +
> +release_rmem:
> +	devm_kfree(dev, rmem);

Not sure why this is needed when devm_kcalloc() is used...

> +	return -EINVAL;
> +}
> +
> +/*
> + * zynqmp_r5_core_init()
> + * Create and initialize zynqmp_r5_core type object
> + *
> + * @cluster: pointer to zynqmp_r5_cluster type object
> + * @fw_reg_val: value expected by firmware to configure RPU cluster mode
> + * @tcm_mode: value expected by fw to configure TCM mode (lockstep or split)
> + *
> + * Return: 0 for success and error code for failure.
> + */
> +static int zynqmp_r5_core_init(struct zynqmp_r5_cluster *cluster,
> +			       enum rpu_oper_mode fw_reg_val,
> +			       enum rpu_tcm_comb tcm_mode)
> +{
> +	struct device *dev = cluster->dev;
> +	struct zynqmp_r5_core *r5_core;
> +	int ret, i;
> +
> +	ret = zynqmp_r5_get_tcm_node(cluster);
> +	if (ret < 0) {
> +		dev_err(dev, "can't get tcm node, err %d\n", ret);
> +		return ret;
> +	}
> +
> +	for (i = 0; i < cluster->core_count; i++) {
> +		r5_core = cluster->r5_cores[i];
> +
> +		ret = zynqmp_r5_get_mem_region_node(r5_core);
> +		if (ret)
> +			dev_warn(dev, "memory-region prop failed %d\n", ret);
> +
> +		/* Initialize r5 cores with power-domains parsed from dts */
> +		ret = of_property_read_u32_index(r5_core->np, "power-domains",
> +						 1, &r5_core->pm_domain_id);
> +		if (ret) {
> +			dev_err(dev, "failed to get power-domains property\n");
> +			return ret;
> +		}
> +
> +		ret = zynqmp_r5_set_mode(r5_core, fw_reg_val, tcm_mode);
> +		if (ret) {
> +			dev_err(dev, "failed to set r5 cluster mode %d, err %d\n",
> +				cluster->mode, ret);
> +			return ret;
> +		}
> +	}
> +
> +	return 0;
> +}
> +
> +/*
> + * zynqmp_r5_cluster_init()
> + * Create and initialize zynqmp_r5_cluster type object
> + *
> + * @cluster: pointer to zynqmp_r5_cluster type object
> + *
> + * Return: 0 for success and error code for failure.
> + */
> +static int zynqmp_r5_cluster_init(struct zynqmp_r5_cluster *cluster)
> +{
> +	enum zynqmp_r5_cluster_mode cluster_mode = LOCKSTEP_MODE;
> +	struct device *dev = cluster->dev;
> +	struct device_node *dev_node = dev_of_node(dev);
> +	struct platform_device *child_pdev;
> +	struct zynqmp_r5_core **r5_cores;
> +	enum rpu_oper_mode fw_reg_val;
> +	struct device **child_devs;
> +	struct device_node *child;
> +	enum rpu_tcm_comb tcm_mode;
> +	int core_count, ret, i;
> +
> +	ret = of_property_read_u32(dev_node, "xlnx,cluster-mode", &cluster_mode);
> +
> +	/*
> +	 * on success returns 0, if not defined then returns -EINVAL,
> +	 * In that case, default is LOCKSTEP mode. Other than that
> +	 * returns relative error code < 0.
> +	 */
> +	if (ret != -EINVAL && ret != 0) {
> +		dev_err(dev, "Invalid xlnx,cluster-mode property\n");
> +		return -EINVAL;

Why not return 'ret' rather than masking the real error code?

> +	}
> +
> +	/*
> +	 * For now driver only supports split mode and lockstep mode.
> +	 * fail driver probe if either of that is not set in dts.
> +	 */
> +	if (cluster_mode == LOCKSTEP_MODE) {
> +		tcm_mode = PM_RPU_TCM_COMB;
> +		fw_reg_val = PM_RPU_MODE_LOCKSTEP;
> +	} else if (cluster_mode == SPLIT_MODE) {
> +		tcm_mode = PM_RPU_TCM_SPLIT;
> +		fw_reg_val = PM_RPU_MODE_SPLIT;
> +	} else {
> +		dev_err(dev, "driver does not support cluster mode %d\n", cluster_mode);
> +		return -EINVAL;
> +	}
> +
> +	/*
> +	 * Number of cores is decided by number of child nodes of
> +	 * r5f subsystem node in dts. If Split mode is used in dts
> +	 * 2 child nodes are expected.
> +	 * In lockstep mode if two child nodes are available,
> +	 * only use first child node and consider it as core0
> +	 * and ignore core1 dt node.
> +	 */
> +	core_count = of_get_available_child_count(dev_node);
> +	if (core_count <= 0) {

Function of_get_available_child_count() does not return a negative value.

More comments to come tomorrow or later this week.

Thanks,
Mathieu


> +		dev_err(dev, "Invalid number of r5 cores %d", core_count);
> +		return -EINVAL;
> +	} else if (cluster_mode == SPLIT_MODE && core_count != 2) {
> +		dev_err(dev, "Invalid number of r5 cores for split mode\n");
> +		return -EINVAL;
> +	} else if (cluster_mode == LOCKSTEP_MODE && core_count == 2) {
> +		dev_warn(dev, "Only r5 core0 will be used\n");
> +		core_count = 1;
> +	}
> +
> +	child_devs = kcalloc(core_count, sizeof(struct device *), GFP_KERNEL);
> +	if (!child_devs)
> +		return -ENOMEM;
> +
> +	r5_cores = kcalloc(core_count,
> +			   sizeof(struct zynqmp_r5_core *), GFP_KERNEL);
> +	if (!r5_cores) {
> +		kfree(child_devs);
> +		return -ENOMEM;
> +	}
> +
> +	i = 0;
> +	for_each_available_child_of_node(dev_node, child) {
> +		child_pdev = of_find_device_by_node(child);
> +		if (!child_pdev) {
> +			of_node_put(child);
> +			ret = -ENODEV;
> +			goto release_r5_cores;
> +		}
> +
> +		child_devs[i] = &child_pdev->dev;
> +
> +		/* create and add remoteproc instance of type struct rproc */
> +		r5_cores[i] = zynqmp_r5_add_rproc_core(&child_pdev->dev);
> +		if (IS_ERR(r5_cores[i])) {
> +			of_node_put(child);
> +			ret = PTR_ERR(r5_cores[i]);
> +			r5_cores[i] = NULL;
> +			goto release_r5_cores;
> +		}
> +
> +		/*
> +		 * If two child nodes are available in dts in lockstep mode,
> +		 * then ignore second child node.
> +		 */
> +		if (cluster_mode == LOCKSTEP_MODE) {
> +			of_node_put(child);
> +			break;
> +		}
> +
> +		i++;
> +	}
> +
> +	cluster->mode = cluster_mode;
> +	cluster->core_count = core_count;
> +	cluster->r5_cores = r5_cores;
> +
> +	ret = zynqmp_r5_core_init(cluster, fw_reg_val, tcm_mode);
> +	if (ret < 0) {
> +		dev_err(dev, "failed to init r5 core err %d\n", ret);
> +		cluster->core_count = 0;
> +		cluster->r5_cores = NULL;
> +
> +		/*
> +		 * at this point rproc resources for each core are allocated.
> +		 * adjust index to free resources in reverse order
> +		 */
> +		i = core_count - 1;
> +		goto release_r5_cores;
> +	}
> +
> +	kfree(child_devs);
> +	return 0;
> +
> +release_r5_cores:
> +	while (i >= 0) {
> +		put_device(child_devs[i]);
> +		if (r5_cores[i]) {
> +			rproc_del(r5_cores[i]->rproc);
> +			rproc_free(r5_cores[i]->rproc);
> +		}
> +		i--;
> +	}
> +	kfree(r5_cores);
> +	kfree(child_devs);
> +	return ret;
> +}
> +
> +static void zynqmp_r5_cluster_exit(void *data)
> +{
> +	struct platform_device *pdev = (struct platform_device *)data;
> +	struct zynqmp_r5_cluster *cluster;
> +	struct zynqmp_r5_core *r5_core;
> +	int i;
> +
> +	cluster = (struct zynqmp_r5_cluster *)platform_get_drvdata(pdev);
> +	if (!cluster)
> +		return;
> +
> +	for (i = 0; i < cluster->core_count; i++) {
> +		r5_core = cluster->r5_cores[i];
> +		put_device(r5_core->dev);
> +		rproc_del(r5_core->rproc);
> +		rproc_free(r5_core->rproc);
> +	}
> +
> +	kfree(cluster->r5_cores);
> +	kfree(cluster);
> +	platform_set_drvdata(pdev, NULL);
> +}
> +
> +/*
> + * zynqmp_r5_remoteproc_probe()
> + * parse device-tree, initialize hardware and allocate required resources
> + * and remoteproc ops
> + *
> + * @pdev: domain platform device for R5 cluster
> + *
> + * Return: 0 for success and < 0 for failure.
> + */
> +static int zynqmp_r5_remoteproc_probe(struct platform_device *pdev)
> +{
> +	struct zynqmp_r5_cluster *cluster;
> +	struct device *dev = &pdev->dev;
> +	int ret;
> +
> +	cluster = kzalloc(sizeof(*cluster), GFP_KERNEL);
> +	if (!cluster)
> +		return -ENOMEM;
> +
> +	cluster->dev = dev;
> +
> +	ret = devm_of_platform_populate(dev);
> +	if (ret) {
> +		dev_err_probe(dev, ret, "failed to populate platform dev\n");
> +		kfree(cluster);
> +		return ret;
> +	}
> +
> +	/* wire in so each core can be cleaned up at driver remove */
> +	platform_set_drvdata(pdev, cluster);
> +
> +	ret = zynqmp_r5_cluster_init(cluster);
> +	if (ret) {
> +		kfree(cluster);
> +		platform_set_drvdata(pdev, NULL);
> +		dev_err_probe(dev, ret, "Invalid r5f subsystem device tree\n");
> +		return ret;
> +	}
> +
> +	ret = devm_add_action_or_reset(dev, zynqmp_r5_cluster_exit, pdev);
> +	if (ret)
> +		return ret;
> +
> +	return 0;
> +}
> +
> +/* Match table for OF platform binding */
> +static const struct of_device_id zynqmp_r5_remoteproc_match[] = {
> +	{ .compatible = "xlnx,zynqmp-r5fss", },
> +	{ /* end of list */ },
> +};
> +MODULE_DEVICE_TABLE(of, zynqmp_r5_remoteproc_match);
> +
> +static struct platform_driver zynqmp_r5_remoteproc_driver = {
> +	.probe = zynqmp_r5_remoteproc_probe,
> +	.driver = {
> +		.name = "zynqmp_r5_remoteproc",
> +		.of_match_table = zynqmp_r5_remoteproc_match,
> +	},
> +};
> +module_platform_driver(zynqmp_r5_remoteproc_driver);
> +
> +MODULE_DESCRIPTION("Xilinx R5F remote processor driver");
> +MODULE_AUTHOR("Xilinx Inc.");
> +MODULE_LICENSE("GPL");
> -- 
> 2.25.1
> 



More information about the linux-arm-kernel mailing list