[PATCH 7/8] serial: imx: use readl() to optimize FIFO reading loop

Sergey Organov sorganov at gmail.com
Tue Jan 17 05:22:51 PST 2023


Uwe Kleine-König <u.kleine-koenig at pengutronix.de> writes:

> On Fri, Jan 13, 2023 at 09:43:33PM +0300, Sergey Organov wrote:
>> Use readl() instead of heavier imx_uart_readl() in the Rx ISR, as we know
>> we read registers that must not be cached.
>> 
>> Signed-off-by: Sergey Organov <sorganov at gmail.com>
>> ---
>>  drivers/tty/serial/imx.c | 5 +++--
>>  1 file changed, 3 insertions(+), 2 deletions(-)
>> 
>> diff --git a/drivers/tty/serial/imx.c b/drivers/tty/serial/imx.c
>> index be00362b8b67..f4236e8995fa 100644
>> --- a/drivers/tty/serial/imx.c
>> +++ b/drivers/tty/serial/imx.c
>> @@ -890,14 +890,15 @@ static irqreturn_t __imx_uart_rxint(int irq, void *dev_id)
>>  	struct imx_port *sport = dev_id;
>>  	unsigned int rx, flg;
>>  	struct tty_port *port = &sport->port.state->port;
>> +	typeof(sport->port.membase) membase = sport->port.membase;
>>  	u32 usr2;
>>  
>>  	/* If we received something, check for 0xff flood */
>> -	usr2 = imx_uart_readl(sport, USR2);
>> +	usr2 = readl(membase + USR2);
>>  	if (usr2 & USR2_RDR)
>>  		imx_uart_check_flood(sport, usr2);
>>  
>> -	while ((rx = imx_uart_readl(sport, URXD0)) & URXD_CHARRDY) {
>> +	while ((rx = readl(membase + URXD0)) & URXD_CHARRDY) {
>>  		flg = TTY_NORMAL;
>>  		sport->port.icount.rx++;
>
> One of the motivations to introduce imx_uart_readl was to have a single
> place to add a debug output to be able to inspect what the driver is
> doing.
>
> I wonder where your need for higher speed comes from and if the compiler
> really generates more effective code with your change.

Mostly it's because I'm obviously slowing things down a bit with the
patch to fight the flood, so I feel obliged to get things back on par
with the origin. Then, higher speed, let alone the time spent with
interrupts disabled and/or spinlocks taken, is always one of generic
goals for me.

As for the generated code, with this patch I don't aim to affect code
generation, I rather avoid execution of part of existing code while
being on the most critical path. It should be quite obvious that not
executing some code is at least not slower than executing it.

>
> Please either drop the patch from your series or provide the differences
> the compiler produces and a benchmark.

If your only objection against this patch is the desire to keep a single
place to add debug output, I'll be happy to tune the resulting code to
still have one.

That said, before we make a decision, could you please tell why register
shadows that the imx_uart_readl/writel are dealing with are needed in
the first place? It looks like all the registers that are shadowed are
readable as well. What's going on here, and if it happens to be a
speed-up, do we have any benchmarks?

Thanks,
-- Sergey Organov



More information about the linux-arm-kernel mailing list