[PATCH v8 10/12] blk-mq: use hk cpus only when isolcpus=io_queue is enabled
Thomas Gleixner
tglx at linutronix.de
Fri Sep 12 07:31:55 PDT 2025
On Fri, Sep 12 2025 at 10:32, Daniel Wagner wrote:
> On Wed, Sep 10, 2025 at 10:20:26AM +0200, Thomas Gleixner wrote:
>> > The cpu_online_mask might change over time, it's not a static bitmap.
>> > Thus it's necessary to update the blk_hk_online_mask. Doing some sort of
>> > caching is certainly possible. Given that we have plenty of cpumask
>> > logic operation in the cpu_group_evenly code path later, I am not so
>> > sure this really makes a huge difference.
>>
>> Sure, but none of this is serialized against CPU hotplug operations. So
>> the resulting mask, which is handed into the spreading code can be
>> concurrently modified. IOW it's not as const as the code claims.
>
> Thanks for explaining.
>
> In group_cpu_evenly:
>
> /*
> * Make a local cache of 'cpu_present_mask', so the two stages
> * spread can observe consistent 'cpu_present_mask' without holding
> * cpu hotplug lock, then we can reduce deadlock risk with cpu
> * hotplug code.
> *
> * Here CPU hotplug may happen when reading `cpu_present_mask`, and
> * we can live with the case because it only affects that hotplug
> * CPU is handled in the 1st or 2nd stage, and either way is correct
> * from API user viewpoint since 2-stage spread is sort of
> * optimization.
> */
> cpumask_copy(npresmsk, data_race(cpu_present_mask));
The present mask is very different from the online mask. The present
mask only changes on physical hotplug when:
- a offline CPU is removed from the present set of CPUs
- a offline CPU is added to it.
In neither case the CPU can be involved in any operation related to the
actual offline/online operations.
Also contrary to your approach, this code takes the possibility of
a concurrently changing mask into account by taking a racy snapshot,
which is immutable for the following operation.
What you are doing with that static mask, makes it a target of
concurrent modification, which is obviously a recipe for subtle bugs.
> Turns out the two stage spread just needs consistent 'cpu_present_mask',
> and remove the CPU hotplug lock by storing it into one local cache. This
> way doesn't change correctness, because all CPUs are still covered.
>
> This sounds like I should do something similar with cpu_online_mask.
Indeed.
Thanks,
tglx
More information about the Linux-nvme
mailing list