[EXT] Re: iMX6/UART imprecise external abort

Uwe Kleine-König u.kleine-koenig at pengutronix.de
Tue Jan 7 14:24:06 PST 2020


On Mon, Dec 23, 2019 at 11:16:27AM +0100, Uwe Kleine-König wrote:
> Hello,
> 
> On Mon, Dec 23, 2019 at 01:53:44AM +0000, Andy Duan wrote:
> > From: Fabio Estevam <festevam at gmail.com> Sent: Saturday, December 21, 2019 8:03 PM
> > > On Sat, Dec 21, 2019 at 4:31 AM Andy Duan <fugang.duan at nxp.com> wrote:
> > > 
> > > > We should ensure the RX FIFO data are not missed since they are valid data.
> > > > To compatible DMA and cpu PIO mode, to receive all RX FIFO data when
> > > > start to send, it will involve complex code logic.
> > > > So I suggest to enable the flag "SER_RS485_RX_DURING_TX", and force to
> > > > enable the flag for imx uart RS485 driver.
> > > 
> > > Inside imx_uart_rs485_config() we have:
> > > 
> > > if (rs485conf->flags & SER_RS485_ENABLED) {
> > >        /* Enable receiver if low-active RTS signal is requested */
> > >        if (sport->have_rtscts &&  !sport->have_rtsgpio &&
> > >            !(rs485conf->flags & SER_RS485_RTS_ON_SEND))
> > >                     rs485conf->flags |= SER_RS485_RX_DURING_TX;
> > > 
> > > Maybe the if() logic needs to be changed so that the
> > > SER_RS485_RX_DURING_TX flag could be set in Andre's case?
> > 
> > I think let the config always is enabled unconditionally: 
> > 	rs485conf->flags |= SER_RS485_RX_DURING_TX;
> 
> I think it should be possible to fix without forcing
> SER_RS485_RX_DURING_TX (which might have surprising effects for
> userspace). Actually I was convinced this problem was fixed in a
> different way in the imx driver already since 76821e222c18 ("serial:
> imx: ensure that RX irqs are off if RX is off").
> 
> The key idea is to disable the RX irq and dma request and only then
> disable RX. This way it is not given that the RX FIFO is empty on
> disable, but the characters are not read and so the exception doesn't
> happen.
> 
> I'll take a deeper look after my vacations in the new year, probably
> some rx485 path was missed in the fix.

I took a look now and found a race condition that might trigger this
problem. The following can happen (in the non-DMA case):


	imx_uart_int()
	  usr1 = imx_uart_readl(sport, USR1);
	  ...
	  ucr1 = imx_uart_readl(sport, UCR1);
	  ucr2 = imx_uart_readl(sport, UCR2);
	  ...
	  if ((ucr1 & UCR1_RRDYEN) == 0)
	    usr1 &= ~USR1_RRDY;
	  if ((ucr2 & UCR2_ATEN) == 0)
	    usr1 &= ~USR1_AGTIM;
	    						imx_uart_start_tx()
							  imx_uart_stop_rx()
							    ...
							    ucr1 &= ~UCR1_RRDYEN;
							    ucr2 &= ~(UCR2_ATEN | UCR2_RXEN)
							    imx_uart_writel(sport, ucr1, UCR1);
							    imx_uart_writel(sport, ucr2, UCR2);

	  if (usr1 & (USR1_RRDY | USR1_AGTIM)) {
	    imx_uart_rxint(irq, dev_id);
	    ...
	  }

Which results in the left execution thread to read from the RX register
while RXEN is off and so trigger the fault.

Currently imx_uart_rxint() grabs the port lock (and imx_uart_start_tx()
also holds it), and so the decision to call imx_uart_rxint() is done
without holding the lock.

The fix is to do the check for UCR1_RRDYEN and UCR2_ATEN (and all the
other similar checks) under the port lock.

So assuming the problem is indeed what we are experiencing here, the
workaround by Andre (i.e. run the UART user and the UART irq on the same
cpu) is a good one.

I will look into this again tomorrow when I'm well rested and create a
patch.

Best regards
Uwe

-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | https://www.pengutronix.de/ |



More information about the linux-arm-kernel mailing list