[PATCH v5 0/6] Add Tegra241 (Grace) CMDQV Support (part 1/2)

Jason Gunthorpe jgg at nvidia.com
Wed Apr 17 05:29:26 PDT 2024


On Wed, Apr 17, 2024 at 09:45:34AM +0000, Shameerali Kolothum Thodi wrote:

> Just to add to that. One idea could be that, when ECMDQs are detected,
> we use them for issuing a limited set of cmds (like stage 1 TLBIs) and
> use the normal cmdq for the rest. Since we use stage 1 for both host and
> Guest nested cases, and TLBIs are the bottleneck in most cases, I think
> this should give performance benefits.

There are definitely options to look at to improve the performance
here.

IMHO the design of the ECMDQ largely seems to expect 1 queue per CPU,
so that we can move to a lock-less design where each CPU uses its own
private per-cpu queue. In this case a VMM calling the kernel to do
invalidation would often naturally use a thread originating on a pCPU
bound to a vCPU which is substantially exclusive to the VM.
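
To make the per-cpu idea concrete, here is a rough user-space sketch of
what "one private queue per CPU" buys us. It is not driver code: the
names (percpu_q, q_issue, q_pending), the CPU count and the queue depth
are all made up for illustration, and the hardware consumer side is left
out. The point is only that the producer path touches nothing but the
issuing CPU's own queue, so it needs no shared lock or atomic.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS   4   /* hypothetical CPU count for the sketch */
#define Q_ENTRIES 8   /* hypothetical queue depth */

/* Opaque command payload; the contents do not matter here. */
struct cmd {
	uint64_t data[2];
};

/* One private queue per CPU (hypothetical layout). */
struct percpu_q {
	struct cmd ring[Q_ENTRIES];
	uint32_t prod;	/* advanced only by the owning CPU */
	uint32_t cons;	/* would be advanced by the hardware consumer */
};

static struct percpu_q queues[NR_CPUS];

/*
 * Issue a command on the queue owned by @cpu. Only that CPU ever
 * writes this queue's prod index, so no lock is taken. In a real
 * kernel the caller would also have to stay on @cpu (preemption
 * disabled) for the duration; that is assumed, not modelled.
 * Returns 0 on success, -1 if the queue is full.
 */
static int q_issue(unsigned int cpu, struct cmd c)
{
	struct percpu_q *q;

	assert(cpu < NR_CPUS);
	q = &queues[cpu];

	if (q->prod - q->cons == Q_ENTRIES)
		return -1;

	q->ring[q->prod % Q_ENTRIES] = c;
	q->prod++;
	return 0;
}

/* Number of commands issued on @cpu's queue and not yet consumed. */
static uint32_t q_pending(unsigned int cpu)
{
	assert(cpu < NR_CPUS);
	return queues[cpu].prod - queues[cpu].cons;
}
```

With this shape, an invalidation issued from a thread running on the
pCPU backing a given vCPU only ever lands in that pCPU's queue, so VMs
pinned to different pCPUs do not contend with each other on the
producer side.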

Jason



More information about the linux-arm-kernel mailing list