[RFC PATCH 3/3] nvme: add the "debug" host driver
Javier González
javier at javigon.com
Fri Feb 4 03:34:23 PST 2022
On 04.02.2022 09:58, Chaitanya Kulkarni wrote:
>On 2/4/22 12:24 AM, Javier González wrote:
>> On 04.02.2022 07:58, Chaitanya Kulkarni wrote:
>>> On 2/3/22 22:28, Damien Le Moal wrote:
>>>> On 2/4/22 12:12, Chaitanya Kulkarni wrote:
>>>>>
>>>>>>>> One can instantiate scsi devices with qemu by using fake scsi
>>>>>>>> devices,
>>>>>>>> but one can also just use scsi_debug to do the same. I see both
>>>>>>>> efforts
>>>>>>>> as desirable, so long as someone maintains this.
>>>>>>>>
>>>>>
>>>>> Why do you think both efforts are desirable?
>>>>
>>>> When testing code using the functionality, it is far easier to get said
>>>> functionality doing a simple "modprobe" rather than having to set up a
>>>> VM. Cf. running blktests or fstests.
>>>>
>>>
>>> agree on simplicity, but then why do we have QEMU implementations for
>>> the NVMe features (e.g. ZNS, NVMe Simple Copy)? We could just build a
>>> memory-backed NVMeOF test target for NVMe controller features.
>>>
>>> Also, recognizing the simplicity, I initially proposed NVMe ZNS
>>> fabrics-based emulation over QEMU (I think I still have the initial state
>>> machine implementation code for ZNS somewhere); those were "nacked" for
>>> the right reason, since we decided to go with QEMU and use that as the
>>> primary platform for testing. So I fail to understand what has
>>> changed, given that QEMU already supports NVMe Simple Copy ...
>>
>> I was not part of this conversation, but as I see it each approach gives
>> a benefit. QEMU is fantastic for compliance testing and I am not sure
>> you get the same level of command analysis anywhere else; at least not
>> without writing dedicated code for this in a target.
>>
>> This said, when we want to test for race conditions, QEMU is very slow.
>
>Can you please elaborate on the scenario and numbers behind the slowness of QEMU?
QEMU is an emulator, not a simulator, so we cannot stress the host stack
the way the null_blk device does: every command has to pass through the
emulated controller in the QEMU process, which keeps the achievable rates
far below what an in-kernel, memory-backed device can sustain. If we want
to test code in the NVMe driver, we need an equivalent of null_blk for
NVMe. It seems like the nvme-loop target can achieve this.
Does this answer your concern?
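To make that concrete, the kind of setup we have been experimenting with
is roughly the following (the subsystem name and device paths are just
examples):

  # memory-backed block device to act as the backend
  modprobe null_blk nr_devices=1
  modprobe nvmet
  modprobe nvme-loop

  # subsystem + namespace backed by /dev/nullb0
  mkdir /sys/kernel/config/nvmet/subsystems/testnqn
  echo 1 > /sys/kernel/config/nvmet/subsystems/testnqn/attr_allow_any_host
  mkdir /sys/kernel/config/nvmet/subsystems/testnqn/namespaces/1
  echo -n /dev/nullb0 > /sys/kernel/config/nvmet/subsystems/testnqn/namespaces/1/device_path
  echo 1 > /sys/kernel/config/nvmet/subsystems/testnqn/namespaces/1/enable

  # loop port and host-side connection
  mkdir /sys/kernel/config/nvmet/ports/1
  echo loop > /sys/kernel/config/nvmet/ports/1/addr_trtype
  ln -s /sys/kernel/config/nvmet/subsystems/testnqn \
        /sys/kernel/config/nvmet/ports/1/subsystems/testnqn
  nvme connect -t loop -n testnqn

That gives us a /dev/nvmeXnY that exercises the host NVMe and fabrics
paths with no emulated hardware in between.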
>
>For race condition testing we can build an error injection framework
>around the code implementation, as is present everywhere in the kernel.
True. That is also a good way to do it.
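For reference, the existing fault injection hooks already give us part of
this: with CONFIG_FAULT_INJECTION_DEBUG_FS enabled, something along these
lines forces failures on a live namespace (the device name is only an
example, and I am quoting the knobs from memory):

  # fail the next few commands on nvme0n1 with 100% probability,
  # and set DNR so they are not retried
  echo 100 > /sys/kernel/debug/nvme0n1/fault_inject/probability
  echo 5 > /sys/kernel/debug/nvme0n1/fault_inject/times
  echo 1 > /sys/kernel/debug/nvme0n1/fault_inject/dont_retry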
>
>> For a software-only solution, we have experimented with something
>> similar to the nvme-debug code that Mikulas is proposing. Adam pointed to
>> the nvme-loop target as an alternative, and this seems to work pretty
>> nicely. I do not believe many changes would be needed to support copy
>> offload using this.
>>
>
>If QEMU is so incompetent, do we then need to add every big feature into
>the NVMeOF test target so that we can test it better? Is that what
>you are proposing? If we implement one feature, it will be
>hard to nack any new features that people come up with using the
>same rationale: "QEMU is slow and it is hard to test race
>conditions, etc."
In my opinion, if people want this and are willing to maintain it, there
is a case for it.
>
>And if that is the case, why don't we have memory-backed ZNS emulation
>in the NVMeOF target? Isn't that a bigger and more
>complicated feature than Simple Copy, given that controller states
>are involved along with AENs?
I think this is a good idea.
>
>ZNS kernel code testing is also done on QEMU; I've fixed
>bugs in the ZNS kernel code that were discovered on QEMU and I've not
>seen any issues with that. Given that the Simple Copy feature is far smaller
>than ZNS, it is less likely to suffer from the slowness etc. (listed
>above) in QEMU.
QEMU is super useful: it is easy to use and it helps identify many issues.
But it is for compliance, not for performance. There was an effort to
make FEMU, but that seems to be an abandoned project.
>
>my point is that if we allow one, we will be opening the floodgates, and
>we need to be careful not to bloat the code unless it is _absolutely
>necessary_, which I don't think it is based on the Simple Copy
>specification.
I understand, and this is a very valid point. It seems like the
nvme-loop target can give us a lot of what we need; all the necessary extra
logic can go into null_blk, and then we do not need NVMe-specific
code.
Do you see any problem with this approach?
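Just to illustrate where the knobs would live: null_blk already takes its
behaviour from module parameters, so the backend for such tests can be as
simple as the line below (the parameters shown already exist; anything
copy-offload specific would of course be new):

  # memory-backed null_blk instance so copied ranges hold real data
  modprobe null_blk nr_devices=1 memory_backed=1 gb=4 bs=4096

Exposed through nvme-loop as in the setup above, that would let us
exercise the host-side copy path without any NVMe-specific emulation code.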
>
>> So in my view having both is not duplication, and it gives more
>> flexibility for validation, which I believe is always good.
>>
>