[PATCH 3/3] nvme: add KConfig options for debug features

Chaitanya Kulkarni chaitanyak at nvidia.com
Sun Dec 12 23:39:41 PST 2021


On 12/12/21 1:22 AM, Sagi Grimberg wrote:
>> From: Chaitanya Kulkarni <kch at nvidia.com>
>>
>> Add KConfig menu option to enable and disable gencounter debug
>> feature that uses config NVME_DEBUG_USE_CID_GENCTR.
>>
>> Signed-off-by: Chaitanya Kulkarni <kch at nvidia.com>
>> ---
>>   drivers/nvme/host/Kconfig | 10 ++++++++++
>>   1 file changed, 10 insertions(+)
>>
>> diff --git a/drivers/nvme/host/Kconfig b/drivers/nvme/host/Kconfig
>> index dc0450ca23a3..dfa2609b7006 100644
>> --- a/drivers/nvme/host/Kconfig
>> +++ b/drivers/nvme/host/Kconfig
>> @@ -1,4 +1,14 @@
>>   # SPDX-License-Identifier: GPL-2.0-only
>> +menu "Debug (Enable driver debug features)"
>> +config NVME_DEBUG_USE_CID_GENCTR
>> +     bool "Enable command ID gen counter for spurious request completion"
>> +     depends on NVME_CORE
>> +     help
>> +       The NVM Express driver will use generation counter
>> +       when calculating the command id. This is needed to debug the
>> +       spurious request completions coming from a buggy controller.
> 
> This is not just to debug - it is also to protect against such a
> controller. What is the purpose of this config option anyways?
> The main distributions will (as they should) enable it anyways...

I can rewrite the help text and rename the menu to "driver features".
We are protecting against a controller that is not stable (buggy),
i.e. one that is doing things it shouldn't be doing in the first
place. If the controller is not buggy, the generation counter only
adds instructions to the fast path that are not needed at all.
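
To make the fast-path cost concrete, here is a rough user-space sketch,
not the driver code. It assumes a 16-bit command ID split into a 4-bit
generation counter and a 12-bit tag. struct sketch_req, sketch_cid() and
sketch_cid_matches() are made-up names for illustration, and the macro
only mirrors the Kconfig symbol from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Comment this out to compile the generation counter away. */
#define NVME_DEBUG_USE_CID_GENCTR 1

/* Assumed layout: 16-bit command ID = 4-bit generation counter | 12-bit tag. */
#define GENCTR_BITS 4
#define TAG_BITS    12
#define GENCTR_MASK ((1u << GENCTR_BITS) - 1)
#define TAG_MASK    ((1u << TAG_BITS) - 1)

/* Hypothetical per-request state; a real driver keeps much more than this. */
struct sketch_req {
	uint16_t tag;    /* request tag, must fit in TAG_BITS */
	uint8_t  genctr; /* bumped each time the request is submitted */
};

/* Submission path: build the command ID handed to the controller. */
static uint16_t sketch_cid(struct sketch_req *req)
{
	assert(req->tag <= TAG_MASK);
#ifdef NVME_DEBUG_USE_CID_GENCTR
	req->genctr++;
	return (uint16_t)(((req->genctr & GENCTR_MASK) << TAG_BITS) | req->tag);
#else
	return req->tag; /* no extra instructions when the option is off */
#endif
}

/* Completion path: true if the completed command ID matches the request. */
static bool sketch_cid_matches(const struct sketch_req *req, uint16_t cid)
{
	if ((cid & TAG_MASK) != req->tag)
		return false;
#ifdef NVME_DEBUG_USE_CID_GENCTR
	return ((cid >> TAG_BITS) & GENCTR_MASK) == (req->genctr & GENCTR_MASK);
#else
	return true; /* a stale completion for a reused tag goes unnoticed */
#endif
}
```

With the option on, a late completion that carries an old generation
value for a reused tag is rejected. With it off, both paths shrink to the
plain tag handling, which is the saving I am arguing for.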

Controllers used in a production environment go through a
qualification process by both vendors and consumers to make sure
they are stable, and detecting spurious completions is a basic part
of that qualification. Hence we should let the user configure genctr
rather than force additional instructions into the fast path, and
keep this pattern for similar cases in the future.

Maybe I didn't understand; can you please explain what the
benefits of the gen-counter are when the controller is stable?

I'll hold off on sending V2. If you can suggest any other way
(other than Kconfig, perhaps a module parameter, I'm not sure) to let
the user configure the genctr calculation, please let me know.
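
For comparison, a runtime switch could look roughly like the user-space
sketch below. The flag use_cid_genctr is a hypothetical stand-in for a
module parameter, sketch_cid_runtime() is a made-up name, and the 4-bit
counter / 12-bit tag split is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical runtime knob standing in for a module parameter. */
static bool use_cid_genctr = true;

/* Assumed layout: upper 4 bits generation counter, lower 12 bits tag. */
static uint16_t sketch_cid_runtime(uint16_t tag, uint8_t *genctr)
{
	assert(genctr);
	if (!use_cid_genctr)
		return tag & 0xfff; /* counter skipped, but the branch remains */
	(*genctr)++;
	return (uint16_t)(((*genctr & 0xf) << 12) | (tag & 0xfff));
}
```

Unlike a compile-time option, this keeps a conditional branch in the
submission path even when the counter is disabled, so it only partly
addresses the fast-path concern.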

-ck


