[PATCHv4 6/6] dmaengine: mv_xor: optimize performance by using a subset of the XOR channels
Vinod Koul
vinod.koul at intel.com
Wed Aug 19 10:05:26 PDT 2015
On Wed, Jul 08, 2015 at 04:28:19PM +0200, Thomas Petazzoni wrote:
> Due to how async_tx behaves internally, having more XOR channels than
> CPUs is actually hurting performance more than it improves it, because
> memcpy requests get scheduled on a different channel than the XOR
> requests, but async_tx will still wait for the completion of the
> memcpy requests before scheduling the XOR requests.
>
> It is in fact more efficient to have at most one channel per CPU,
> which this patch implements by limiting the number of channels per
> engine, and the number of engines registered depending on the number
> of available CPUs.
>
> Marvell platforms are currently available in one-CPU, two-CPU and
> four-CPU configurations:
>
> - in the configurations with one CPU, only one channel from one
> engine is used.
>
> - in the configurations with two CPUs, only one channel from each
> engine is used (there are two XOR engines).
>
> - in the configurations with four CPUs, both channels of both engines
> are used.
Applied, thanks
--
~Vinod
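
For readers of the archive, the channel-limiting rule described in the quoted
commit message can be sketched in plain C as below. This is only an
illustration under stated assumptions, not the code from the patch: the names
xor_limits, xor_limits_for_cpus, XOR_ENGINES_PER_SOC and
XOR_CHANNELS_PER_ENGINE are hypothetical, and the constants (two engines, two
channels per engine) are taken from the three configurations listed above.

```c
/* Hypothetical constants, taken from the commit message: two XOR
 * engines per SoC, two channels per engine. */
#define XOR_ENGINES_PER_SOC     2u
#define XOR_CHANNELS_PER_ENGINE 2u

/* Hypothetical result type: how many engines to register and how
 * many channels to enable on each registered engine. */
struct xor_limits {
	unsigned int engines;
	unsigned int channels_per_engine;
};

static unsigned int min_uint(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/* Aim for at most one XOR channel per CPU:
 *   1 CPU  -> 1 engine,  1 channel
 *   2 CPUs -> 2 engines, 1 channel each
 *   4 CPUs -> 2 engines, 2 channels each
 * Other CPU counts are not discussed in the commit message. */
static struct xor_limits xor_limits_for_cpus(unsigned int ncpus)
{
	struct xor_limits lim;

	/* Never register more engines than there are CPUs. */
	lim.engines = min_uint(ncpus, XOR_ENGINES_PER_SOC);

	/* Spread the CPUs across the engines, rounding up. */
	lim.channels_per_engine = min_uint((ncpus + 1) / 2,
					   XOR_CHANNELS_PER_ENGINE);
	return lim;
}
```

For the one-, two- and four-CPU cases, engines * channels_per_engine equals
the CPU count, which matches the "at most one channel per CPU" goal stated in
the commit message.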