[PATCH blktests v1 2/3] nvme/rc: Avoid triggering host nvme-cli autoconnect

Hannes Reinecke hare at suse.de
Wed Jul 12 23:00:28 PDT 2023


On 7/13/23 02:12, Max Gurtovoy wrote:
> 
> 
> On 12/07/2023 15:04, Daniel Wagner wrote:
>> On Mon, Jul 10, 2023 at 07:30:20PM +0300, Max Gurtovoy wrote:
>>>
>>>
>>> On 10/07/2023 18:03, Daniel Wagner wrote:
>>>> On Mon, Jul 10, 2023 at 03:31:23PM +0300, Max Gurtovoy wrote:
>>>>> I think it is more than just commit message.
>>>>
>>>> Okay, starting to understand what's the problem.
>>>>
>>>>> A lot of code that we can avoid was added regarding the --context
>>>>> cmdline argument.
>>>>
>>>> Correct and it's not optional to get the tests passing for the fc 
>>>> transport.
>>>
>>> Why does fc need --context to pass the tests?
>>
>> A typical nvme test consists of the following steps (nvme/004):
>>
>> // nvme target setup (1)
>>     _create_nvmet_subsystem "blktests-subsystem-1" "${loop_dev}" \
>>         "91fdba0d-f87b-4c25-b80f-db7be1418b9e"
>>     _add_nvmet_subsys_to_port "${port}" "blktests-subsystem-1"
>>
>> // nvme host setup (2)
>>     _nvme_connect_subsys "${nvme_trtype}" blktests-subsystem-1
>>
>>     local nvmedev
>>     nvmedev=$(_find_nvme_dev "blktests-subsystem-1")
>>     cat "/sys/block/${nvmedev}n1/uuid"
>>     cat "/sys/block/${nvmedev}n1/wwid"
>>
>> // nvme host teardown (3)
>>     _nvme_disconnect_subsys blktests-subsystem-1
>>
>> // nvme target teardown (4)
>>     _remove_nvmet_subsystem_from_port "${port}" "blktests-subsystem-1"
>>     _remove_nvmet_subsystem "blktests-subsystem-1"
>>
>>
>> The corresponding output with --context
>>
>>   run blktests nvme/004 at 2023-07-12 13:49:50
>> // (1)
>>   loop0: detected capacity change from 0 to 32768
>>   nvmet: adding nsid 1 to subsystem blktests-subsystem-1
>>   nvme nvme2: NVME-FC{0}: create association : host wwpn 0x20001100aa000002  rport wwpn 0x20001100aa000001: NQN "blktests-subsystem-1"
>>   (NULL device *): {0:0} Association created
>>   [174] nvmet: ctrl 1 start keep-alive timer for 5 secs
>> // (2)
>>   nvmet: creating nvm controller 1 for subsystem blktests-subsystem-1 for NQN nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349.
>>   [374] nvmet: adding queue 1 to ctrl 1.
>>   [1138] nvmet: adding queue 2 to ctrl 1.
>>   [73] nvmet: adding queue 3 to ctrl 1.
>>   [174] nvmet: adding queue 4 to ctrl 1.
>>   nvme nvme2: NVME-FC{0}: controller connect complete
>>   nvme nvme2: NVME-FC{0}: new ctrl: NQN "blktests-subsystem-1"
>> // (3)
>>   nvme nvme2: Removing ctrl: NQN "blktests-subsystem-1"
>> // (4)
>>   [1138] nvmet: ctrl 1 stop keep-alive
>>   (NULL device *): {0:0} Association deleted
>>   (NULL device *): {0:0} Association freed
>>   (NULL device *): Disconnect LS failed: No Association
>>
>>
>> and without --context
>>
>>   run blktests nvme/004 at 2023-07-12 13:50:33
>> // (1)
>>   loop1: detected capacity change from 0 to 32768
>>   nvmet: adding nsid 1 to subsystem blktests-subsystem-1
>>   nvme nvme2: NVME-FC{0}: create association : host wwpn 0x20001100aa000002  rport wwpn 0x20001100aa000001: NQN "nqn.2014-08.org.nvmexpress.discovery"
> 
> Why is this association to the discovery controller created? Because of
> some system service?
> 
Yes. There are nvme-autoconnect udev rules and systemd services
installed by default (on quite a few systems now).
And it's really hard (if not impossible) to disable these services, as
we cannot be sure how they are named; hence we wouldn't know which
service to disable.
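
To illustrate why this is guesswork, here is a minimal sketch that can
only pattern-match on unit names. The function name and the pattern are
assumptions made for this example, not a list of what any distribution
actually ships:

```shell
# Heuristic filter: keep only names on stdin that look NVMe-fabrics related.
# The pattern is an assumption; a distribution is free to pick other names,
# in which case this filter silently misses them.
filter_nvme_units() {
	grep -Ei 'nvme|nvmf' || true
}

# Example (not run here): list candidate units on the local system.
#   systemctl list-unit-files --no-legend | awk '{print $1}' | filter_nvme_units
```

Anything the pattern misses keeps running, which is why disabling
services by name is not a reliable approach.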

> can we configure the blktests subsystem not to be discovered or add some 
> access list to it ?
> 
But that's precisely what the '--context' thing is attempting to do ...

[ .. ]
>>>>
>>>> It really solves the problem that the autoconnect setup of nvme-cli is
>>>> disturbing the tests (*). The only other way I found to stop the
>>>> autoconnect is by disabling the udev rule completely. If autoconnect
>>>> isn't enabled the context isn't necessary.
>>>> Though changing system configuration from blktests seems a bit
>>>> excessive.
>>>
>>> we should not stop any autoconnect during blktests. The autoconnect 
>>> and all the system admin services should run normally.
>>
>> I do not agree here. The current blktests are not designed to run as
>> integration tests. Sure, we should also test this, but currently
>> blktests is just not there and tcp/rdma are not actually covered anyway.
> 
> What do you mean, tcp/rdma are not covered?
> 
Because there is no autoconnect functionality for tcp/rdma.
For FC we have full topology information, and the driver can emit udev
events whenever an NVMe port appears in the fabric (and the systemd
machinery will then start autoconnect).
For TCP/RDMA we do not have this, so really there's nothing which could
send udev events (discounting things like mDNS and nvme-stas for now).
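
As a rough sketch of what the FC side looks like: the check below
assumes the discovery uevent carries an FC_EVENT=nvmediscovery property,
which is what the nvme-cli autoconnect udev rule matches on as far as I
know. The function name is made up for this example.

```shell
# Return 0 if the KEY=VALUE property lines on stdin describe an NVMe-FC
# discovery event. Assumes the event is tagged FC_EVENT=nvmediscovery.
is_nvme_fc_discovery_event() {
	grep -qx 'FC_EVENT=nvmediscovery'
}

# Example (not run here): watch the events live on an FC system.
#   udevadm monitor --kernel --property --subsystem-match=fc
```

There is no comparable event to match for TCP/RDMA, so no rule can fire.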

> And maybe we should make several changes in the blktests to make it
> standalone without interfering with the existing configuration made by some
> system administrator.
> 
??
But this is what we are trying to do with these patches.
The '--context' flag only needs to be set for the blktests, to inform
the rest of the system that these subsystems/configurations are special
and should be exempted from 'normal' system processing.
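
Roughly, and only as a sketch: the helper below prints the connect
command a test would issue. The function name and the context value
'blktests' are assumptions for illustration, and the transport
addresses that fc additionally needs are left out.

```shell
# Print (do not run) an nvme-cli connect command tagged with a context.
# $1: transport type, $2: subsystem NQN, $3: context name (optional;
# the default "blktests" is a hypothetical value for this sketch).
print_connect_cmd() {
	printf 'nvme connect --transport=%s --nqn=%s --context=%s\n' \
		"$1" "$2" "${3:-blktests}"
}

print_connect_cmd fc blktests-subsystem-1
```

Connections made without that context are left to the normal system
machinery.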

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare at suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Ivo Totev, Andrew
Myers, Andrew McDonald, Martje Boudien Moerman



