omap-serial RX DMA polling?
Paul Walmsley
paul at pwsan.com
Tue Jan 24 05:47:29 EST 2012
On Tue, 24 Jan 2012, Russell King - ARM Linux wrote:
> On Tue, Jan 24, 2012 at 12:58:57AM -0700, Paul Walmsley wrote:
>
> > In a correctly-working RX PIO path, the driver is going to receive an
> > interrupt the moment the data is ready to be transferred from the FIFO.
>
> That's hellishly inefficient.
If the point is to minimize the receive latency, as Govindraj described
earlier, then setting an RX FIFO threshold to one byte is the way to go.
It certainly seems preferable to the use of a DMA RX path with a 1
microsecond polling timer. Ideally this would be something that the
serial user could tune.
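
To put rough numbers on that comparison, here is a minimal back-of-the-envelope sketch. The function names are hypothetical, and 8N1 framing (10 bits on the wire per character) is an assumption, not something taken from the driver.

```python
def rx_interrupts_per_second(baud, rx_threshold, bits_per_char=10):
    """Worst-case RX interrupt rate on a continuously busy line.

    bits_per_char=10 assumes 8N1 framing (start + 8 data + stop).
    One interrupt is counted per rx_threshold received characters.
    """
    chars_per_second = baud / bits_per_char
    return chars_per_second / rx_threshold


def timer_wakeups_per_second(period_us):
    """Wakeups per second caused by a periodic polling timer."""
    return 1_000_000 / period_us
```

At 115200 baud with a threshold of one byte, the first function gives 11520 interrupts per second, and only while data is actually arriving. A timer with a 1 microsecond period gives 1000000 wakeups per second for as long as it is armed, whether or not the line is idle.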
> Generally, what you want for transmit is to wait for the TX FIFO to
> drain to maybe half full, and then reload it until it is completely
> full.
Interesting rule of thumb. For OMAP there are also power management
considerations. For example, if we can estimate the maximum amount of
time it will take for the CPU to refill the transmit FIFO, then the TX
FIFO threshold can be adjusted down to reduce the number of
wakeups/interrupts needed to transmit a buffer.
In fact from a narrow PM perspective, the ideal TX FIFO threshold would
basically be zero: to allow the entire FIFO to drain before waking the CPU
back up to refill it. There's no data loss restriction as there is with
the RX FIFO. Of course, many serial users couldn't tolerate such a
setting and still work acceptably. It would be nice if the driver could
allow serial users to override the estimate that it generates.
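
The tradeoff can be sketched with a toy model. Everything here is an assumption made for illustration: the names are hypothetical, the FIFO is filled completely before transmission starts, an interrupt fires each time the level drains to the threshold, and each refill tops the FIFO back up to full.

```python
def tx_refill_interrupts(total_bytes, fifo_depth, tx_threshold):
    """Refill interrupts needed to send total_bytes.

    tx_threshold is the fill level at which the interrupt fires, so
    each refill writes (fifo_depth - tx_threshold) bytes.
    """
    if not 0 <= tx_threshold < fifo_depth:
        raise ValueError("tx_threshold must be in [0, fifo_depth)")
    if total_bytes <= fifo_depth:
        return 0  # the initial fill covers the whole buffer
    per_refill = fifo_depth - tx_threshold
    remaining = total_bytes - fifo_depth
    return -(-remaining // per_refill)  # integer ceiling division


def tx_refill_slack_us(tx_threshold, baud, bits_per_char=10):
    """Time the CPU has to refill before the line goes idle, in us.

    bits_per_char=10 assumes 8N1 framing.
    """
    return tx_threshold * bits_per_char * 1_000_000 / baud
```

For a 4096-byte buffer and an assumed 64-byte FIFO, a half-full threshold (32) needs 126 refill interrupts, a threshold of 0 needs 63, and a threshold that leaves one character of space (63) needs 4032. The cost of the zero threshold shows up in the second function: the slack is 0, so the line sits idle for however long the CPU takes to wake up and refill.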
> For the RX FIFO, you want to set the watermark such that you get a
> decent number of bytes in there before the receive interrupt is
> raised, but not so many that an overrun is likely.
One other constraint. If the RX FIFO threshold is set too high, then the
CPU is effectively prevented from entering a deep sleep state, since the
CPU has to be able to wake up in time to prevent an RX overrun. The lower
the RX FIFO threshold, the more time the CPU has to wake up, and the
deeper the sleep state the CPU can enter.
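
As a rough illustration of that constraint, the wake-up budget is the time the line needs to fill the space left above the threshold. This is a sketch with hypothetical names; it assumes 8N1 framing and a continuously busy line, and it ignores the extra character the receive shift register can hold, so it is slightly conservative.

```python
def rx_wakeup_budget_us(fifo_depth, rx_threshold, baud, bits_per_char=10):
    """Time between the RX interrupt and a full FIFO, in microseconds.

    The interrupt fires at rx_threshold bytes; the FIFO is full after
    (fifo_depth - rx_threshold) more characters arrive.
    """
    free_chars = fifo_depth - rx_threshold
    return free_chars * bits_per_char * 1_000_000 / baud


def sleep_state_is_safe(wakeup_latency_us, fifo_depth, rx_threshold, baud):
    """True if the CPU can wake from this state before the FIFO fills."""
    return wakeup_latency_us < rx_wakeup_budget_us(
        fifo_depth, rx_threshold, baud)
```

With an assumed 64-byte FIFO at 115200 baud, a threshold of 56 leaves roughly 694 microseconds to wake up and drain the FIFO, while a threshold of 1 leaves 5468.75 microseconds. A sleep state with, say, a 1 ms wake-up latency would only be usable with the lower threshold.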
> One of the points of having FIFOs is that they batch up the transmit and
> receive activity to make it more efficient at servicing the UART.
Yep. Also, another point is to allow the servicer to enter a low power
state while the FIFOs fill or drain.
> Setting the FIFO levels to one character virtually negates the point of
> having FIFOs - there is no point setting the TX FIFO to raise an
> interrupt when there's one character space left. As has already been
> reported, this just puts the interrupt rate up, and means you waste a
> lot more CPU (or bus) time servicing the transmit path.
In the case of this particular patchset, there was indeed a point to
setting the TX FIFO to 1; it was to work around a hardware bug. As the
patch description stated, it's a pretty nasty penalty that is worth
avoiding if at all possible[*]. I'm not endorsing that as an appropriate
setting outside of a bug workaround.
> As for RX DMA vs RX PIO, that depends on the UART (I don't know how
> OMAP's UARTs behave.) To sanely use RX DMA, you need the UART to raise
> the RX timeout interrupt after characters have been offloaded by the
> RX DMA. Let's say that the RX FIFO is 32 bytes deep, and it's set to
> raise the RX DMA request at 16 bytes full. If you program the DMA
> controller to burst 16 bytes off the RX FIFO, you'll empty it and
> it'll never raise the RX timeout interrupt. So you'll need to know
> how many characters you're expecting.
>
> If on the other hand you burst 8 bytes off the RX FIFO, you'll leave
> 8 bytes in the FIFO. If the UART works properly, it will raise an
> RX timeout interrupt after N bit periods where the RX line is inactive.
>
> What that means is that during a burst of RX activity, your DMA takes
> the strain of receiving characters, and you process those characters
> when either the RX buffer becomes full or when there's a pause in
> reception. This gives good efficiency during bursts while maintaining
> interactivity - to the same levels as that expected by RX PIO using
> the FIFO.
Well, Govindraj has some low-latency requirement, and no way to specify
how many bytes he's expecting. So if RX DMA is going to be used, the
driver will still need some kind of timer to flush any bytes that could be
stuck in the middle of a DMA transfer. This still seems like a case where
RX PIO would do a better job; no need for a timer, and immediate
notification when a character arrives, if the threshold is set that way.
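
The 16-byte versus 8-byte burst example quoted above can be checked with a toy model. This is only a sketch under stated assumptions: the function name is hypothetical, the DMA burst is treated as instantaneous, and the UART is assumed to raise the RX timeout only when the line goes idle with at least one byte still in the FIFO. Whether the OMAP UART behaves that way is exactly what is in question here.

```python
def receive_then_idle(n_bytes, dma_trigger, dma_burst):
    """Receive n_bytes back to back, then let the line go idle.

    Returns (bytes moved by DMA, bytes left in FIFO, timeout raised).
    Assumes dma_burst <= dma_trigger so a burst never underflows.
    """
    if not 0 < dma_burst <= dma_trigger:
        raise ValueError("need 0 < dma_burst <= dma_trigger")
    fifo = 0
    dma_moved = 0
    for _ in range(n_bytes):
        fifo += 1
        if fifo >= dma_trigger:
            # DMA request asserted: burst bytes out of the FIFO.
            fifo -= dma_burst
            dma_moved += dma_burst
    # Assumed UART behavior: timeout fires only if data is left over.
    timeout_raised = fifo > 0
    return dma_moved, fifo, timeout_raised
```

With a trigger of 16 and a burst of 16, receiving exactly 16 bytes returns (16, 0, False): the data sits in the DMA buffer and nothing tells the driver about it, which is the case that needs either a known transfer length or a flush timer. With a burst of 8, the same input returns (8, 8, True), so the timeout can notify the driver, provided the timeout interrupt is actually delivered.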
As far as RX timeouts go, those don't seem to be delivered properly
when the CPU is in a low-power state. This is probably due to the
previously-mentioned hardware bug, although it could be due to a driver
bug. So we may be out of luck there. We (meaning the people working on
OMAP) also need to figure out what the OMAP UART RX timeout would
theoretically be, since it doesn't appear to be documented.
Thanks for the comments.
- Paul
* There is another workaround for this bug under development here that
shouldn't require changing the TX FIFO. If it passes testing here, then
the TX FIFO of 1 shouldn't be needed.