[PATCHv4 6/6] dmaengine: mv_xor: optimize performance by using a subset of the XOR channels

Vinod Koul vinod.koul at intel.com
Wed Aug 19 10:05:26 PDT 2015


On Wed, Jul 08, 2015 at 04:28:19PM +0200, Thomas Petazzoni wrote:
> Due to how async_tx behaves internally, having more XOR channels than
> CPUs actually hurts performance rather than improving it: memcpy
> requests get scheduled on a different channel than the XOR requests,
> yet async_tx still waits for the memcpy requests to complete before
> scheduling the XOR requests.
> 
> It is in fact more efficient to have at most one channel per CPU,
> which this patch implements by limiting the number of channels per
> engine, and the number of engines registered, depending on the number
> of available CPUs.
> 
> Marvell platforms are currently available in one-CPU, two-CPU and
> four-CPU configurations:
> 
>  - in the configurations with one CPU, only one channel from one
>    engine is used.
> 
>  - in the configurations with two CPUs, only one channel from each
>    engine is used (there are two XOR engines)
> 
>  - in the configurations with four CPUs, both channels of both engines
>    are used.
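
For readers following along, the mapping described in the list above can be
sketched in plain C as below. This is only an illustration, not the code from
the patch: the names xor_layout, xor_pick_layout, XOR_MAX_ENGINES and
XOR_MAX_CHANNELS are hypothetical, and the handling of CPU counts other than
one, two and four (rounding down so the total never exceeds the CPU count) is
an assumption.

```c
/* Hypothetical hardware limits, taken from the description above:
 * two XOR engines, each with two channels. */
#define XOR_MAX_ENGINES  2u
#define XOR_MAX_CHANNELS 2u

/* Hypothetical result type: how many engines to register and how
 * many channels to use on each of them. */
struct xor_layout {
	unsigned int engines;
	unsigned int channels_per_engine;
};

static unsigned int min_uint(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/*
 * Pick a layout with at most one channel per CPU.
 * Assumption: for CPU counts not listed in the commit message the
 * per-engine channel count is rounded down.
 */
static struct xor_layout xor_pick_layout(unsigned int ncpus)
{
	struct xor_layout l;

	if (ncpus == 0)
		ncpus = 1;	/* treat a bogus count as a single CPU */

	/* Spread over engines first, then add channels per engine. */
	l.engines = min_uint(ncpus, XOR_MAX_ENGINES);
	l.channels_per_engine = min_uint(ncpus / l.engines, XOR_MAX_CHANNELS);
	return l;
}
```

With these assumptions, xor_pick_layout(1) gives one engine with one channel,
xor_pick_layout(2) gives two engines with one channel each, and
xor_pick_layout(4) gives two engines with two channels each, matching the
three cases listed in the commit message.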

Applied, thanks

-- 
~Vinod




More information about the linux-arm-kernel mailing list