[PATCH v4 1/1] nvme: multipath: Implemented new iopolicy "queue-depth"
John Meneghini
jmeneghi at redhat.com
Wed May 22 09:23:51 PDT 2024
On 5/22/24 11:56, Keith Busch wrote:
> On Wed, May 22, 2024 at 11:42:12AM -0400, John Meneghini wrote:
>> +static void nvme_subsys_iopolicy_update(struct nvme_subsystem *subsys, int iopolicy)
>> +{
>> + struct nvme_ctrl *ctrl;
>> + int old_iopolicy = READ_ONCE(subsys->iopolicy);
>> +
>> + WRITE_ONCE(subsys->iopolicy, iopolicy);
>> +
>> + /* iopolicy changes reset the counters and clear the mpath by design */
>> + mutex_lock(&nvme_subsystems_lock);
>> + list_for_each_entry(ctrl, &subsys->ctrls, subsys_entry) {
>> + atomic_set(&ctrl->nr_active, 0);
>
> Can you me understand why this is a desirable feature? Unless you
> quiesce everything at some point, you'll always have more unaccounted
> requests on whichever path has higher latency. That sounds like it
> defeats the goals of this io policy.
This is true. And as a matter of practice I never change the IO policy when IOs are in flight. I always stop the IO first.
But we can't stop any user from changing the IO policy again and again. So I'm not sure what to do.
If you'd like I add the 'if (old_iopolicy == iopolicy) return;' here, but that's not going to solve the problem of inaccurate
counters when users start flipping io policies around. with IO inflight. There is no synchronization between io submission
across controllers and changing the policy so I expect changing between round-robin and queue-depth with IO inflight suffers
from the same problem... though not as badly.
I'd rather take this patch now and figure out how to fix the problem with another patch in the future. Maybe we can check the
io stats and refuse to change the policy of they are not zero....
>> @@ -1061,6 +1066,9 @@ static inline bool nvme_disk_is_ns_head(struct gendisk *disk)
>> {
>> return false;
>> }
>> +static inline void nvme_subsys_iopolicy_update(struct nvme_subsystem *subsys, int iopolicy)
>> +{
>> +}
>> #endif /* CONFIG_NVME_MULTIPATH */
>
> You can remove this stub function since the only caller resides in a
> CONFIG_NVME_MULTIPATH file.
>
I can do that. Since I have to spin a version 5 patch, do you want me to make the change above?
Is there anything else?
/John
More information about the Linux-nvme
mailing list