[PATCH] remoteproc: imx_dsp_rproc: add custom memory copy implementation for i.MX DSP Cores
Iuliana Prodan
iuliana.prodan at nxp.com
Thu Jan 26 16:16:16 PST 2023
Hi Mathieu,
On 1/27/2023 12:49 AM, Mathieu Poirier wrote:
> On Wed, Jan 25, 2023 at 01:01:00PM +0200, Iuliana Prodan (OSS) wrote:
>> From: Iuliana Prodan <iuliana.prodan at nxp.com>
>>
>> The IRAM is part of the HiFi DSP.
>> According to hardware specification only 32-bits write are allowed
>> otherwise we get a Kernel panic.
>>
>> Therefore add a custom memory copy function to deal with the
>> above restriction.
>>
>> Signed-off-by: Iuliana Prodan <iuliana.prodan at nxp.com>
>> ---
>> drivers/remoteproc/imx_dsp_rproc.c | 122 ++++++++++++++++++++++++++++-
>> 1 file changed, 121 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/remoteproc/imx_dsp_rproc.c b/drivers/remoteproc/imx_dsp_rproc.c
>> index 95da1cbefacf..a9991d085494 100644
>> --- a/drivers/remoteproc/imx_dsp_rproc.c
>> +++ b/drivers/remoteproc/imx_dsp_rproc.c
>> @@ -715,6 +715,126 @@ static void imx_dsp_rproc_kick(struct rproc *rproc, int vqid)
>> dev_err(dev, "%s: failed (%d, err:%d)\n", __func__, vqid, err);
>> }
>>
>> +/*
>> + * Custom memory copy implementation for i.MX DSP Cores
>> + *
>> + * The IRAM is part of the HiFi DSP.
>> + * According to hw specs only 32-bits writes are allowed.
>> + */
>> +static int imx_dsp_rproc_memcpy(void *dest, const void *src, size_t size)
>> +{
>> + const u8 *src_byte = src;
>> + u32 affected_mask;
>> + u32 tmp;
>> + int q, r;
>> +
>> + q = size / 4;
>> + r = size % 4;
>> +
>> + /* __iowrite32_copy use 32bit size values so divide by 4 */
>> + __iowrite32_copy(dest, src, q);
> The current driver for imx_dsp_rproc does not provide a rproc_da_to_va()
> operation, meaning that @is_iomem in rproc_elf_load_segments() can't be true,
> forcing a memcpy() operation to be used. And yet above an _iowrite32_copy() is
> used...
Yes, with rproc_elf_load_segments() we go through memcpy() because
is_iomem is always false (I already have a patch to remove this flag
from imx_dsp_rproc, since it is not used). But memcpy(), unlike
__iowrite32_copy(), crashes on sizes that are not a multiple of 32 bits.
That's why I added the __iowrite32_copy() call and, just above it,
divide the size by 4.
Also, imx_dsp_rproc shouldn't use memcpy() at all; it should be
memcpy_toio(), because all addresses are ioremapped - see
https://elixir.bootlin.com/linux/v6.2-rc5/source/drivers/remoteproc/imx_dsp_rproc.c#L601
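As an illustration only (this is not the patch code), the split into
whole words plus a read-modify-write of the last word can be modelled in
plain userspace C. copy_words_only() is a hypothetical name. The sketch
assumes dest is 4-byte aligned and that the word holding the tail belongs
to the same region, and it uses ordinary volatile accesses in place of
__iowrite32_copy()/ioread32()/iowrite32():

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Userspace model: "dest" stands in for the DSP IRAM and is only ever
 * touched through whole, aligned 32-bit loads and stores.
 * Assumption: the region is padded to a multiple of 4 bytes, so reading
 * and writing the final word is legal.
 */
static void copy_words_only(volatile uint32_t *dest, const void *src,
			    size_t size)
{
	const uint8_t *s = src;
	size_t q = size / 4;	/* number of complete 32-bit words */
	size_t r = size % 4;	/* leftover bytes, 0..3 */
	uint32_t word;
	size_t i;

	assert(((uintptr_t)dest & 3) == 0);

	for (i = 0; i < q; i++) {
		memcpy(&word, s + 4 * i, 4);	/* src may be unaligned */
		dest[i] = word;			/* one 32-bit store */
	}

	if (r) {
		/* read-modify-write: bytes past the copy keep their value */
		word = dest[q];
		memcpy(&word, s + 4 * q, r);	/* replace only the first r bytes */
		dest[q] = word;
	}
}
```

For a 6-byte source, q = 1 and r = 2: one full word is stored, then the
second word is read, its first two bytes are replaced, and it is written
back, so the two bytes after the copy keep their previous value. The last
word is built with memcpy() of r bytes, so the sketch never reads beyond
src + size and does not depend on byte order.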
> In the conversation that came out of[1], Daniel Baluta mentions that
> read/writes should be done in multiples of 32/64 bit but here a blanket 32 bit
> enforcement is done.
I've checked the documentation more closely and talked to our hardware
team: on NXP's DSPs, writes are restricted to 32 bits.
>> +
>> + if (r) {
>> + affected_mask = (1 << (8 * r)) - 1;
>> +
>> + /* first read the 32bit data of dest, then change affected
>> + * bytes, and write back to dest.
>> + * For unaffected bytes, it should not be changed
>> + */
>> + tmp = ioread32(dest + q * 4);
>> + tmp &= ~affected_mask;
>> +
>> + tmp |= *(u32 *)(src_byte + q * 4) & affected_mask;
>> + iowrite32(tmp, dest + q * 4);
>> + }
>> +
>> + return 0;
>> +}
>> +
>> +/**
>> + * imx_dsp_rproc_elf_load_segments() - load firmware segments to memory
>> + * @rproc: remote processor which will be booted using these fw segments
>> + * @fw: the ELF firmware image
>> + *
>> + * This function loads the firmware segments to memory, where the remote
>> + * processor expects them.
>> + *
>> + * Return: 0 on success and an appropriate error code otherwise
>> + */
>> +static int imx_dsp_rproc_elf_load_segments(struct rproc *rproc, const struct firmware *fw)
>> +{
>> + struct device *dev = &rproc->dev;
>> + const void *ehdr, *phdr;
>> + int i, ret = 0;
>> + u16 phnum;
>> + const u8 *elf_data = fw->data;
>> + u8 class = fw_elf_get_class(fw);
>> + u32 elf_phdr_get_size = elf_size_of_phdr(class);
>> +
>> + ehdr = elf_data;
>> + phnum = elf_hdr_get_e_phnum(class, ehdr);
>> + phdr = elf_data + elf_hdr_get_e_phoff(class, ehdr);
>> +
>> + /* go through the available ELF segments */
>> + for (i = 0; i < phnum; i++, phdr += elf_phdr_get_size) {
>> + u64 da = elf_phdr_get_p_paddr(class, phdr);
>> + u64 memsz = elf_phdr_get_p_memsz(class, phdr);
>> + u64 filesz = elf_phdr_get_p_filesz(class, phdr);
>> + u64 offset = elf_phdr_get_p_offset(class, phdr);
>> + u32 type = elf_phdr_get_p_type(class, phdr);
>> + bool is_iomem = false;
>> + void *ptr;
>> +
>> + if (type != PT_LOAD || !memsz)
>> + continue;
>> +
>> + dev_dbg(dev, "phdr: type %d da 0x%llx memsz 0x%llx filesz 0x%llx\n",
>> + type, da, memsz, filesz);
>> +
>> + if (filesz > memsz) {
>> + dev_err(dev, "bad phdr filesz 0x%llx memsz 0x%llx\n",
>> + filesz, memsz);
>> + ret = -EINVAL;
>> + break;
>> + }
>> +
>> + if (offset + filesz > fw->size) {
>> + dev_err(dev, "truncated fw: need 0x%llx avail 0x%zx\n",
>> + offset + filesz, fw->size);
>> + ret = -EINVAL;
>> + break;
>> + }
>> +
>> + if (!rproc_u64_fit_in_size_t(memsz)) {
>> + dev_err(dev, "size (%llx) does not fit in size_t type\n",
>> + memsz);
>> + ret = -EOVERFLOW;
>> + break;
>> + }
>> +
>> + /* grab the kernel address for this device address */
>> + ptr = rproc_da_to_va(rproc, da, memsz, &is_iomem);
>> + if (!ptr) {
>> + dev_err(dev, "bad phdr da 0x%llx mem 0x%llx\n", da,
>> + memsz);
>> + ret = -EINVAL;
>> + break;
>> + }
>> +
>> + /* put the segment where the remote processor expects it */
>> + if (filesz) {
>> + ret = imx_dsp_rproc_memcpy(ptr, elf_data + offset, filesz);
>> + if (ret) {
>> + dev_err(dev, "memory copy failed for da 0x%llx memsz 0x%llx\n",
>> + da, memsz);
>> + break;
>> + }
>> + }
> This patchset from last year[1] goes to great length to avoid using a driver
> specific function and now you are trying to bring that back... So how was it
> working before and why are things broken now?
Until now, it was used in a limited scenario and the firmware was
built to respect the write restriction, with the IRAM section sizes
being a multiple of 4 bytes.
Now, I was trying a simple hello_world sample from Zephyr, compiled with
gcc, and I crashed the kernel trying to load it on the HiFi4 DSP:
[ 2707.135094] SError Interrupt on CPU0, code 0x00000000bf000002 -- SError
[ 2707.135104] CPU: 0 PID: 665 Comm: sh Tainted: G C 6.1.0-rc6-04789-gc80e5155d190 #135
[ 2707.135112] Hardware name: Freescale i.MX8QM MEK (DT)
[ 2707.135115] pstate: 400000c5 (nZcv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 2707.135123] pc : vprintk_store+0x3c0/0x460
[ 2707.135141] lr : vprintk_store+0x48/0x460
[ 2707.135149] sp : ffff80000b31b8c0
[ 2707.135152] x29: ffff80000b31b8c0 x28: 0000000000000018 x27: 0000000000000000
[ 2707.135164] x26: 00000000596f8000 x25: ffff000810c15038 x24: 0000000000000000
[ 2707.135173] x23: ffff8000098dfaf8 x22: 0000000000000000 x21: 0000000000000000
[ 2707.135181] x20: ffff80000b31ba30 x19: ffff800009abca31 x18: ffffffffffffffff
[ 2707.135190] x17: 6620313231783020 x16: 7a736d656d203030 x15: ffff80008b31b867
[ 2707.135199] x14: 0000000000000000 x13: ffff800009d22428 x12: 000000000000083a
[ 2707.135208] x11: 00000000000002be x10: 0000000000000a40 x9 : ffff80000b31bb70
[ 2707.135216] x8 : ffff80000b31bb70 x7 : 00000000ffffffc8 x6 : ffff80000b31bb30
[ 2707.135224] x5 : ffff00081a098000 x4 : ffff80000b31ba30 x3 : ffff8000098dfaf8
[ 2707.135233] x2 : ffff80000990c000 x1 : 0000000000000000 x0 : ffff8008eface000
[ 2707.135243] Kernel panic - not syncing: Asynchronous SError Interrupt
[ 2707.135248] CPU: 0 PID: 665 Comm: sh Tainted: G C 6.1.0-rc6-04789-gc80e5155d190 #135
[ 2707.135254] Hardware name: Freescale i.MX8QM MEK (DT)
[ 2707.135258] Call trace:
[ 2707.135261] dump_backtrace.part.0+0xdc/0xf0
[ 2707.135275] show_stack+0x18/0x40
[ 2707.135284] dump_stack_lvl+0x68/0x84
[ 2707.135295] dump_stack+0x18/0x34
[ 2707.135302] panic+0x184/0x344
[ 2707.135311] nmi_panic+0xac/0xb0
[ 2707.135319] arm64_serror_panic+0x6c/0x7c
[ 2707.135324] do_serror+0x58/0x5c
[ 2707.135329] el1h_64_error_handler+0x30/0x4c
[ 2707.135339] el1h_64_error+0x64/0x68
[ 2707.135344] vprintk_store+0x3c0/0x460
[ 2707.135353] vprintk_emit+0x104/0x294
[ 2707.135360] vprintk_default+0x38/0x4c
[ 2707.135369] vprintk+0xc0/0xe4
[ 2707.135376] _printk+0x5c/0x84
[ 2707.135383] rproc_elf_load_segments+0x228/0x308
[ 2707.135391] rproc_start+0x50/0x1c8
[ 2707.135398] rproc_boot+0x494/0x574
[ 2707.135404] state_store+0x44/0x110
[ 2707.135413] dev_attr_store+0x18/0x30
[ 2707.135422] sysfs_kf_write+0x44/0x54
[ 2707.135433] kernfs_fop_write_iter+0x118/0x1b0
[ 2707.135441] vfs_write+0x220/0x2b0
[ 2707.135449] ksys_write+0x68/0xf4
[ 2707.135455] __arm64_sys_write+0x1c/0x2c
[ 2707.135461] invoke_syscall+0x48/0x114
[ 2707.135470] el0_svc_common.constprop.0+0xd4/0xfc
[ 2707.135477] do_el0_svc+0x30/0xd0
[ 2707.135484] el0_svc+0x2c/0x84
[ 2707.135492] el0t_64_sync_handler+0xbc/0x140
[ 2707.135501] el0t_64_sync+0x18c/0x190
[ 2707.135510] SMP: stopping secondary CPUs
[ 2707.135520] Kernel Offset: disabled
[ 2707.135522] CPU features: 0x20000,2082c084,0000421b
[ 2707.135527] Memory Limit: none
[ 2707.397688] ---[ end Kernel panic - not syncing: Asynchronous SError Interrupt ]---
> Moreover, function
> rproc_elf_load_segments() deals with situations where the memory slot is bigger
> than the file size[2], which is omitted here.
I'll add this part in v2.
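For reference, the generic loader at [2] clears the bytes between filesz
and memsz with memset()/memset_io(). A rough userspace sketch of how the
same zero-fill could be done with 32-bit accesses only is below; it is an
assumption about one possible approach, not the v2 code. zero_words_only()
is a hypothetical name, base is assumed 4-byte aligned, and the region is
assumed padded so the word holding the last byte can be read and written:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Zero the bytes [start, end) of a region that only tolerates whole,
 * aligned 32-bit accesses. Partial words at either edge are handled
 * with a read-modify-write; full words are stored directly.
 */
static void zero_words_only(volatile uint32_t *base, size_t start,
			    size_t end)
{
	size_t off = start;

	assert(((uintptr_t)base & 3) == 0);

	while (off < end) {
		size_t idx = off / 4;		/* word being updated */
		size_t lo = off % 4;		/* first byte to clear */
		size_t left = end - 4 * idx;	/* bytes from word start to end */
		size_t hi = left < 4 ? left : 4;/* one past last byte to clear */
		uint32_t word = 0;

		if (lo != 0 || hi != 4) {
			/* partial word: keep the bytes outside [lo, hi) */
			word = base[idx];
			memset((uint8_t *)&word + lo, 0, hi - lo);
		}
		base[idx] = word;		/* one 32-bit store */
		off = 4 * idx + hi;
	}
}
```

In loader terms, start would be filesz and end would be memsz, both taken
relative to the address returned by rproc_da_to_va().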
Thanks,
Iulia
> Thanks,
> Mathieu
>
> [1]. https://lore.kernel.org/linux-arm-kernel/20220323064944.1351923-1-peng.fan@oss.nxp.com/
> [2]. https://elixir.bootlin.com/linux/v6.2-rc5/source/drivers/remoteproc/remoteproc_elf_loader.c#L221
>