[PATCH 03/16] tty: serial: 8250_core: read only RX if there is something in the FIFO

Peter Hurley peter at hurleysoftware.com
Tue Feb 10 09:46:32 PST 2015


On 02/10/2015 07:04 AM, Nicolas Schichan wrote:
> On 02/10/2015 12:34 AM, Peter Hurley wrote:
>> Hi Nicolas,
>>
>> Thanks for the report.
>>
> [...]
>>> When a caracter is received on the UART while the kernel is printing
>>> the boot messages, as soon as the kernel configures the UART for
>>> receiving (after root filesystem mount), it gets stuck printing the
>>> following message repeatedly:
>>>
>>> serial8250: too much work for irq29
>>>
>>> Once stuck, the reception of another character allows the boot process
>>> to finish.
>>>
>>> From what I can gather, when we hit that, the UART_IIR_NO_INT is 0 (so the
>>> interrupt is raised), but the UART_LSR_DR bit is 0 as well so the UART_RX
>>> register is never read to clear the interrupt.
>>
>> The "too much work" message means serial8250_handle_irq() is returning 0,
>> ie., not handled. Which in turn means IIR indicates no interrupt is pending
>> (UART_IIR_NO_INT == 1).
>>
>> Can you log the register values for LSR and IIR at both patch locations
>> in serial8250_do_startup()?
>>
>> (I can get you a debug patch, if necessary. Let me know)
> 
> Hi Peter,
> 
> Thanks for your reply.
> 
> Here is what I have when the issue is triggered:
> 
> [   12.154877] lsr 0x60 / iir 0x01
> [   12.158071] lsr 0x60 / iir 0x01
> [   12.161438] serial8250: too much work for irq29
> [   12.165982] lsr 0x60 / iir 0x0c
> [   12.169354] serial8250: too much work for irq29
> [   12.173900] lsr 0x60 / iir 0x0c
> (previous two messages are repeated and printk_ratelimited())

Thanks for this information; I see I was wrong about the cause of message.

I think what happens during startup is that on this silicon clearing
the rx fifo (by serial8250_clear_fifos()) clears data ready but not
the rx timeout condition which causes a spurious rx interrupt when
interrupts are enabled.

So caught between two broken UARTs: one that underflows its rx fifo because
of unsolicited rx reads and the other that generates spurious interrupt
without unsolicited rx reads.


> When the issue is not triggered:
> 
> [   10.784871] lsr 0x60 / iir 0x01
> [   10.788066] lsr 0x60 / iir 0x01
> [   10.794734] VFS: Mounted root (nfs filesystem) readonly on device 0:13.
> [   10.801654] devtmpfs: mounted
> [   10.805169] Freeing unused kernel memory: 184K (807be000 - 807ec000)
> (userland takes over after that)
> 
> I have also displayed the IIR and LSR registers when the "too much fork for
> IRQ" condition is triggered.
> 
> In the serial8250_do_startup(), before the interrupt are unmasked at the end,
> the IIR looks sane and UART_IIR_NO_INT bit is set. When stuck
> serial8250_interrupt(), UART_IIR_NO_INT is cleared and the interrupt ID is set
> to 0xc which is not handled by the kernel at this time (the Kirkwood datasheet
>  indicates that it is some kind of timeout condition from what I can gather).

Yes, IIR == UART_IIR_RX_TIMEOUT is to used indicate that data is in the rx fifo
but has not reached the rx trigger level yet.

ATM, I'm not exactly sure if there is a safe way to clear the spurious interrupt
from the interrupt handler.

I'm fairly certain the only way to clear the rx timeout interrupt is to read
the rx fifo, but I think this would race with actual data arrival. IOW, there
might not be a way to determine if the data read is spurious or not.

Regards,
Peter Hurley

> Here is the corresponding debug patch, for reference (against a 3.19 kernel
> this time):
> 
> diff --git a/drivers/tty/serial/8250/8250_core.c b/drivers/tty/serial/8250/8250_
> index 11c6685..ed93741 100644
> --- a/drivers/tty/serial/8250/8250_core.c
> +++ b/drivers/tty/serial/8250/8250_core.c
> @@ -1661,6 +1661,12 @@ static int exar_handle_irq(struct uart_port *port)
>         return ret;
>  }
> 
> +static inline void show_lsr_iir(struct uart_port *port)
> +{
> +       printk("lsr 0x%02x / iir 0x%02x\n", serial_port_in(port, UART_LSR),
> +              serial_port_in(port, UART_IIR));
> +}
> +
>  /*
>   * This is the serial driver's interrupt routine.
>   *
> @@ -1705,6 +1711,8 @@ static irqreturn_t serial8250_interrupt(int irq, void *dev
>                         /* If we hit this, we're dead. */
>                         printk_ratelimited(KERN_ERR
>                                 "serial8250: too much work for irq%d\n", irq);
> +                       if (printk_ratelimit())
> +                               show_lsr_iir(port);
>                         break;
>                 }
>         } while (l != end);
> @@ -2107,6 +2115,7 @@ int serial8250_do_startup(struct uart_port *port)
>         /*
>          * Clear the interrupt registers.
>          */
> +       show_lsr_iir(port);
>         if (serial_port_in(port, UART_LSR) & UART_LSR_DR)
>                 serial_port_in(port, UART_RX);
>         serial_port_in(port, UART_IIR);
> @@ -2269,6 +2278,7 @@ dont_test_tx_en:
>          * saved flags to avoid getting false values from polling
>          * routines or the previous session.
>          */
> +       show_lsr_iir(port);
>         if (serial_port_in(port, UART_LSR) & UART_LSR_DR)
>                 serial_port_in(port, UART_RX);
>         serial_port_in(port, UART_IIR);
> 
> Regards,
> 




More information about the linux-arm-kernel mailing list