[PATCH v3 for-4.13 2/6] mlx5: move affinity hints assignments to generic code

Saeed Mahameed saeedm at dev.mellanox.co.il
Thu Jun 8 04:42:07 PDT 2017


On Thu, Jun 8, 2017 at 1:16 PM, Sagi Grimberg <sagi at grimberg.me> wrote:
>
>>>> is there any reason you want to start assigning vectors on the local
>>>> node?  This is doable, but would complicate the code quite a bit
>>>> so it needs a good argument.
>>>
>>> My interpretation is that mlx5 tried to do this for the (rather esoteric
>>> in my mind) case where the platform does not have enough vectors for the
>>> driver to allocate percpu. In this case, the next best thing is to stay
>>> as close to the device affinity as possible.
>>>
>>
>> No, we did it because the mlx5e netdevice assumes that
>> IRQ[0]..IRQ[#num_numa/#cpu_per_numa]
>> are always bound to the NUMA node close to the device, and the mlx5e
>> driver chooses those IRQs and spreads the RSS hash only across them.
>> It never uses the other IRQs/cores.
>
>
> OK, that explains a lot of weirdness I've seen with mlx5e.
>
> Can you explain why you're using only a single numa node for your RSS
> table? What does it buy you? You open RX rings for _all_ cpus but
> only spread on part of them? I must be missing something here...

Adding Tariq,

this is also part of the weirdness :). We do that so that any
out-of-the-box (OOB) test you run always gets the best performance,
because we guarantee that only cores on the close NUMA node are used.

We open RX rings on all of the cores in case the user wants to
change the RSS table on the fly to point to all of them, using
"ethtool -X".
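
To make the above concrete, here is a minimal sketch (not mlx5e code) of
the two RSS indirection tables being discussed: the default one that
points only at the rings on the close NUMA node, and one that covers all
rings, roughly what a user would get from something like
"ethtool -X <dev> equal <N>". The ring counts, the table size and the
helper name are made up for illustration, and the round-robin fill is an
assumption about how the slots get spread.

```python
def build_indirection_table(table_size, num_rings):
    """Fill an RSS indirection table round-robin over rings 0..num_rings-1.

    Hypothetical helper for illustration; the modulo spread is an assumption.
    """
    if num_rings <= 0:
        raise ValueError("num_rings must be positive")
    return [slot % num_rings for slot in range(table_size)]


# Hypothetical machine: one RX ring per core, 16 cores in total,
# 8 of them on the NUMA node close to the device (rings 0..7).
TOTAL_RINGS = 16
LOCAL_NODE_RINGS = 8
TABLE_SIZE = 128  # illustrative size, not taken from the driver

# Default behaviour described above: hash only into the close-NUMA rings.
default_table = build_indirection_table(TABLE_SIZE, LOCAL_NODE_RINGS)

# After the user re-points the table at every ring.
full_table = build_indirection_table(TABLE_SIZE, TOTAL_RINGS)

print("rings used by default:", sorted(set(default_table)))
print("rings used after change:", sorted(set(full_table)))
```

In this sketch rings 8..15 exist but receive no traffic until the table
is rewritten, which is the behaviour Sagi noticed.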

But we are willing to change that, and Tariq can provide the patch.
Without that change, mlx5e is broken.


