[PATCH v3] remoteproc: imx_dsp_rproc: Add support of recovery and coredump process
Iuliana Prodan
iuliana.prodan at nxp.com
Mon Jul 28 08:09:02 PDT 2025
On 7/28/2025 5:14 PM, Mathieu Poirier wrote:
> On Mon, Jul 28, 2025 at 01:39:38PM +0300, Daniel Baluta wrote:
>> On Tue, Jul 22, 2025 at 11:16 AM Shengjiu Wang <shengjiu.wang at nxp.com> wrote:
>>>
>>> FW recovery can be enabled, but it is broken because the software reset
>>> is missing from the recovery flow. So move the software reset from
>>> imx_dsp_runtime_resume() to .load() and clear memory before loading
>>> firmware to make recovery work.
>>>
>>> Add call rproc_coredump_set_elf_info() to initialize the elf info for
>>> coredump, otherwise coredump will report error "ELF class is not set".
>>>
>>> Fixes: ec0e5549f358 ("remoteproc: imx_dsp_rproc: Add remoteproc driver for DSP on i.MX")
>>> Signed-off-by: Shengjiu Wang <shengjiu.wang at nxp.com>
>>
>> Changes look good to me:
I agree, but this is not enough.
>>
>> Reviewed-by: Daniel Baluta <daniel.baluta at nxp.com>
>>
>> I've tested it with Zephyr synchronization samples inducing a crash
>> via debugfs interface. App
>> can recover correctly.
The synchronization sample does not use the Messaging Unit (MU) for
communication between the two cores; its behavior is similar to the
basic hello_world example (no fw_ready reply is expected by the host).
I’ve tested this patch with both the synchronization and hello_world
samples, as well as with the default firmware specified in the device
tree (imx/dsp/hifi4.bin), and everything works as expected.
However, when testing with the openamp_rsc_table sample from Zephyr [1],
I encountered the following issue:
```
[ 1500.964232] remoteproc remoteproc0: crash detected in imx-dsp-rproc: type watchdog
[ 1500.964595] remoteproc remoteproc0: handling crash #1 in imx-dsp-rproc
[ 1500.964608] remoteproc remoteproc0: recovering imx-dsp-rproc
[ 1500.965959] remoteproc remoteproc0: stopped remote processor imx-dsp-rproc
[ 1501.251897] remoteproc remoteproc0: can't start rproc imx-dsp-rproc: -110
```
Upon debugging, I discovered that the issue stems from the imx-mailbox
driver not clearing the pending bits of the general-purpose interrupts
(GIPn) whose enable bits (GIEn) are cleared. As a result, the remote
processor fails to restart and the start times out (-110 is -ETIMEDOUT).
To ensure compatibility across all firmware variants, including those
using OpenAMP, the attached patch is required. Both the recovery and
mailbox patches have been successfully tested on the following
platforms: i.MX8MP, i.MX8ULP, i.MX8QM and i.MX8QXP.
Shengjiu, do you want to send a new version with both patches?
Thanks,
Iulia
>
> Very good - I will merge this around 6.17-rc2 when I get back from vacation.
>
> Mathieu
>
[1] https://github.com/zephyrproject-rtos/zephyr/tree/main/samples/subsys/ipc/openamp_rsc_table
-------------- next part --------------
From 47786070f1ffbd73f4ff0009e2dbddc79d607e86 Mon Sep 17 00:00:00 2001
From: Iuliana Prodan <iuliana.prodan at nxp.com>
Date: Mon, 28 Jul 2025 15:21:24 +0300
Subject: [PATCH 4/4] mailbox: imx: Clear pending bits for the GPIs that are
not enabled
Enhance the i.MX Messaging Unit interrupt service routine
to properly handle general-purpose interrupts (GIP) that
are pending but have their corresponding enable bits (GIEn)
cleared.
This ensures that we can notify the host - such as sending
a fw_ready reply from the DSP remote core - on the second
or any subsequent startup.
Signed-off-by: Iuliana Prodan <iuliana.prodan at nxp.com>
---
drivers/mailbox/imx-mailbox.c | 31 +++++++++++++++++++++++++++++--
1 file changed, 29 insertions(+), 2 deletions(-)
diff --git a/drivers/mailbox/imx-mailbox.c b/drivers/mailbox/imx-mailbox.c
index 6b9dbd6a337a..2d1d81545673 100644
--- a/drivers/mailbox/imx-mailbox.c
+++ b/drivers/mailbox/imx-mailbox.c
@@ -40,6 +40,9 @@
 #define IMX_MU_SECO_TX_TOUT (msecs_to_jiffies(3000))
 #define IMX_MU_SECO_RX_TOUT (msecs_to_jiffies(3000))
 
+/* 4 general-purpose interrupt requests reflected to the other side */
+#define IMX_MU_GIP_NO 4
+
 /* Please not change TX & RX */
 enum imx_mu_chan_type {
 	IMX_MU_TYPE_TX = 0, /* Tx */
@@ -143,7 +146,7 @@ struct imx_mu_dcfg {
 /* MU reset */
 #define IMX_MU_xCR_RST(type) (type & IMX_MU_V2 ? BIT(0) : BIT(5))
 #define IMX_MU_xSR_RST(type) (type & IMX_MU_V2 ? BIT(0) : BIT(7))
-
+#define IMX_MU_xSR_BRDIP(type) (type & IMX_MU_V2 ? BIT(0) : BIT(9))
 
 static struct imx_mu_priv *to_imx_mu_priv(struct mbox_controller *mbox)
 {
@@ -530,7 +533,31 @@ static irqreturn_t imx_mu_isr(int irq, void *p)
 	struct mbox_chan *chan = p;
 	struct imx_mu_priv *priv = to_imx_mu_priv(chan->mbox);
 	struct imx_mu_con_priv *cp = chan->con_priv;
-	u32 val, ctrl;
+	u32 i, val, ctrl;
+	u32 gips = 0, gies = 0;
+	u32 mu_cr = imx_mu_read(priv, priv->dcfg->xCR[IMX_MU_GCR]);
+	u32 mu_sr = imx_mu_read(priv, priv->dcfg->xSR[IMX_MU_GSR]);
+	u32 brdip = IMX_MU_xSR_BRDIP(priv->dcfg->type);
+
+	for (i = 0; i < IMX_MU_GIP_NO; i++) {
+		gips |= IMX_MU_xSR_GIPn(priv->dcfg->type, i);
+		gies |= IMX_MU_xCR_GIEn(priv->dcfg->type, i);
+	}
+	/* Keep only GIEn bits that are disabled */
+	gies &= (~mu_cr);
+	/* Keep only GIPn bits that are pending */
+	gips &= mu_sr;
+	/* Keep only GIPn bits that have the corresponding GIEn bits disabled */
+	gips &= gies;
+
+	/*
+	 * Clear the BRDIP bit, processor B-side is out of reset,
+	 * which also clears general purpose interrupt 3
+	 */
+	if (mu_sr & brdip)
+		gips |= brdip;
+	/* Clear pending bits for the general purpose interrupts that are not enabled */
+	imx_mu_write(priv, gips, priv->dcfg->xSR[IMX_MU_GSR]);
 
 	switch (cp->type) {
 	case IMX_MU_TYPE_TX:
--
2.25.1