[PATCH] mailbox: cancel timer before starting it

Sudeep Holla sudeep.holla at arm.com
Fri Oct 16 04:44:28 EDT 2020


On Thu, Oct 15, 2020 at 03:29:35PM +0100, Ionela Voinescu wrote:
> Hi Jerome,
> 
> On Thursday 15 Oct 2020 at 15:58:30 (+0200), Jerome Brunet wrote:
> > 
> > On Thu 15 Oct 2020 at 15:46, Ionela Voinescu <ionela.voinescu at arm.com> wrote:
> > 
> > > Hi guys,
> > >
> > > On Wednesday 23 Sep 2020 at 14:39:16 (+0200), Jerome Brunet wrote:
> > >> If the txdone is done by polling, it is possible for msg_submit() to start
> > >> the timer while txdone_hrtimer() callback is running. If the timer needs
> > >> rescheduling, it could already be enqueued by the time hrtimer_forward_now()
> > >> is called, leading hrtimer to loudly complain.
> > >> 
> > >> WARNING: CPU: 3 PID: 74 at kernel/time/hrtimer.c:932 hrtimer_forward+0xc4/0x110
> > >> CPU: 3 PID: 74 Comm: kworker/u8:1 Not tainted 5.9.0-rc2-00236-gd3520067d01c-dirty #5
> > >> Hardware name: Libre Computer AML-S805X-AC (DT)
> > >> Workqueue: events_freezable_power_ thermal_zone_device_check
> > >> pstate: 20000085 (nzCv daIf -PAN -UAO BTYPE=--)
> > >> pc : hrtimer_forward+0xc4/0x110
> > >> lr : txdone_hrtimer+0xf8/0x118
> > >> [...]
> > >> 
> > >> Canceling the timer before starting it ensures that the timer callback is
> > >> not running when the timer is started, solving this race condition.
> > >> 
> > >> Fixes: 0cc67945ea59 ("mailbox: switch to hrtimer for tx_complete polling")
> > >> Reported-by: Da Xue <da at libre.computer>
> > >> Signed-off-by: Jerome Brunet <jbrunet at baylibre.com>
> > >> ---
> > >>  drivers/mailbox/mailbox.c | 8 ++++++--
> > >>  1 file changed, 6 insertions(+), 2 deletions(-)
> > >> 
> > >> diff --git a/drivers/mailbox/mailbox.c b/drivers/mailbox/mailbox.c
> > >> index 0b821a5b2db8..34f9ab01caef 100644
> > >> --- a/drivers/mailbox/mailbox.c
> > >> +++ b/drivers/mailbox/mailbox.c
> > >> @@ -82,9 +82,13 @@ static void msg_submit(struct mbox_chan *chan)
> > >>  exit:
> > >>  	spin_unlock_irqrestore(&chan->lock, flags);
> > >>  
> > >> -	if (!err && (chan->txdone_method & TXDONE_BY_POLL))
> > >> -		/* kick start the timer immediately to avoid delays */
> > >> +	if (!err && (chan->txdone_method & TXDONE_BY_POLL)) {
> > >> +		/* Disable the timer if already active ... */
> > >> +		hrtimer_cancel(&chan->mbox->poll_hrt);
> > >> +
> > >> +		/* ... and kick start it immediately to avoid delays */
> > >>  		hrtimer_start(&chan->mbox->poll_hrt, 0, HRTIMER_MODE_REL);
> > >> +	}
> > >>  }
> > >>  
> > >>  static void tx_tick(struct mbox_chan *chan, int r)
> > >
> > > I've tracked a regression back to this commit. Details to reproduce:
> > 
> > Hi Ionela,
> > 
> > I don't have access to your platform and I don't get what is going on
> > from the log below.
> > 
> > Could you please give us a few more details about what is going on?
> > 
> 
> I'm not familiar with the mailbox subsystem, so the best I can do right
> now is to add Sudeep to Cc, in case this conflicts in some way with the
> ARM MHU patches [1].
>

No, it can't be the doorbell driver, as we use SCPI (old firmware) with the
upstream MHU driver as is, limiting the number of channels to be used.

> In the meantime I'll get some traces and get more familiar with the
> code.
>

I will try that too.

-- 
Regards,
Sudeep


