On 13/07/2023 9:00, Hannes Reinecke wrote:
On 7/13/23 02:12, Max Gurtovoy wrote:
On 12/07/2023 15:04, Daniel Wagner wrote:
On Mon, Jul 10, 2023 at 07:30:20PM +0300, Max Gurtovoy wrote:
On 10/07/2023 18:03, Daniel Wagner wrote:
On Mon, Jul 10, 2023 at 03:31:23PM +0300, Max Gurtovoy wrote:
I think it is more than just commit message.
Okay, I'm starting to understand what the problem is.
A lot of code that we could avoid was added for the --context cmdline
argument.
Correct, and it's not optional: it's needed to get the tests passing
for the fc transport.
Why does the fc transport need --context to pass the tests?
A typical nvme test consists of the following steps (nvme/004):
// nvme target setup (1)
_create_nvmet_subsystem "blktests-subsystem-1" "${loop_dev}" \
"91fdba0d-f87b-4c25-b80f-db7be1418b9e"
_add_nvmet_subsys_to_port "${port}" "blktests-subsystem-1"
// nvme host setup (2)
_nvme_connect_subsys "${nvme_trtype}" blktests-subsystem-1
local nvmedev
nvmedev=$(_find_nvme_dev "blktests-subsystem-1")
cat "/sys/block/${nvmedev}n1/uuid"
cat "/sys/block/${nvmedev}n1/wwid"
// nvme host teardown (3)
_nvme_disconnect_subsys blktests-subsystem-1
// nvme target teardown (4)
_remove_nvmet_subsystem_from_port "${port}" "blktests-subsystem-1"
_remove_nvmet_subsystem "blktests-subsystem-1"
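For orientation, the host-side helpers above roughly boil down to plain
nvme-cli calls like the following (a sketch only; the FC addresses are
taken from the logs below, and the real blktests helpers add more
options and error handling):

```shell
# Rough expansion of _nvme_connect_subsys / _nvme_disconnect_subsys for
# the fc transport. Address values are illustrative.

# (2) host setup: connect to the target subsystem
nvme connect --transport=fc \
    --traddr=nn-0x20001100aa000001:pn-0x20001100aa000001 \
    --host-traddr=nn-0x20001100aa000002:pn-0x20001100aa000002 \
    --nqn=blktests-subsystem-1

# (3) host teardown: drop the connection again
nvme disconnect --nqn=blktests-subsystem-1
```

These commands need real (or emulated) FC fabric hardware and root
privileges, so they are a system-configuration sketch, not a runnable
snippet.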
The corresponding output with --context:
run blktests nvme/004 at 2023-07-12 13:49:50
// (1)
loop0: detected capacity change from 0 to 32768
nvmet: adding nsid 1 to subsystem blktests-subsystem-1
nvme nvme2: NVME-FC{0}: create association : host wwpn
0x20001100aa000002 rport wwpn 0x20001100aa000001: NQN
"blktests-subsystem-1"
(NULL device *): {0:0} Association created
[174] nvmet: ctrl 1 start keep-alive timer for 5 secs
// (2)
nvmet: creating nvm controller 1 for subsystem blktests-subsystem-1
for NQN
nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349.
[374] nvmet: adding queue 1 to ctrl 1.
[1138] nvmet: adding queue 2 to ctrl 1.
[73] nvmet: adding queue 3 to ctrl 1.
[174] nvmet: adding queue 4 to ctrl 1.
nvme nvme2: NVME-FC{0}: controller connect complete
nvme nvme2: NVME-FC{0}: new ctrl: NQN "blktests-subsystem-1"
// (3)
nvme nvme2: Removing ctrl: NQN "blktests-subsystem-1"
// (4)
[1138] nvmet: ctrl 1 stop keep-alive
(NULL device *): {0:0} Association deleted
(NULL device *): {0:0} Association freed
(NULL device *): Disconnect LS failed: No Association
and without --context:
run blktests nvme/004 at 2023-07-12 13:50:33
// (1)
loop1: detected capacity change from 0 to 32768
nvmet: adding nsid 1 to subsystem blktests-subsystem-1
nvme nvme2: NVME-FC{0}: create association : host wwpn
0x20001100aa000002 rport wwpn 0x20001100aa000001: NQN
"nqn.2014-08.org.nvmexpress.discovery"
Why is this association to the discovery controller created? Because of
some system service?
Yes. There are nvme-autoconnect udev rules and systemd services
installed per default (in quite some systems now).
And it's really hard (if not impossible) to disable these services, as
we cannot be sure how they are named, hence we wouldn't know which
service to disable.
Right. We shouldn't disable them IMO.
Can we configure the blktests subsystem not to be discovered, or add
some access list to it?
But that's precisely what the '--context' thing is attempting to do ...
I'm not sure it is.
Exposing the subsystem is a matter of target-side configuration.
Additionally, the --context (which is on the initiator/host side),
according to Daniel, is there to distinguish between different
invocations. I proposed that the blktests subsystem not be part of the
discoverable fabric, or be protected somehow by an access list.
Therefore, no additional invocation will happen.
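For reference, such target-side protection can be sketched with the
kernel's nvmet configfs interface (the paths follow the standard nvmet
configfs layout; reading the host NQN from /etc/nvme/hostnqn is an
illustrative assumption):

```shell
# Restrict blktests-subsystem-1 to an explicit host allow-list instead
# of allowing any host to connect (and hence discover it).
SUBSYS=/sys/kernel/config/nvmet/subsystems/blktests-subsystem-1
HOSTNQN=$(cat /etc/nvme/hostnqn)

# Turn off the "any host may connect" attribute...
echo 0 > "$SUBSYS/attr_allow_any_host"

# ...create a host object and admit only the host blktests runs on.
mkdir "/sys/kernel/config/nvmet/hosts/$HOSTNQN"
ln -s "/sys/kernel/config/nvmet/hosts/$HOSTNQN" \
      "$SUBSYS/allowed_hosts/$HOSTNQN"
```

This is a configuration fragment requiring root and a loaded nvmet
module, not something blktests does today.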
[ .. ]
It really solves the problem that the autoconnect setup of nvme-cli is
disturbing the tests (*). The only other way I found to stop the
autoconnect is to disable the udev rule completely. If autoconnect
isn't enabled, the context isn't necessary.
Though changing system configuration from blktests seems a bit
excessive.
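Disabling the rule would amount to masking it for the duration of the
run, along these lines (the rule file name is the one nvme-cli
installs; distributions may name or place it differently):

```shell
# Mask the nvme-cli autoconnect udev rule by shadowing it in /run,
# which takes precedence over /usr/lib/udev/rules.d.
ln -sf /dev/null /run/udev/rules.d/70-nvmf-autoconnect.rules
udevadm control --reload

# ... run the tests ...

# Undo the mask afterwards.
rm /run/udev/rules.d/70-nvmf-autoconnect.rules
udevadm control --reload
```

This mutates global system state, which is exactly the kind of change
the thread argues blktests should avoid.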
We should not stop any autoconnect during blktests. The autoconnect
and all the system admin services should run normally.
I do not agree here. The current blktests are not designed to run as
integration tests. Sure, we should also test this, but currently
blktests is just not there, and tcp/rdma are not actually covered
anyway.
What do you mean, tcp/rdma not covered?
Because there is no autoconnect functionality for tcp/rdma.
For FC we have full topology information, and the driver can emit udev
events whenever an NVMe port appears in the fabric (and the systemd
machinery will then start autoconnect).
For TCP/RDMA we do not have this, so really there's nothing which could
send udev events (discounting things like mDNS and nvme-stas for now).
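The FC autoconnect path hinges on a udev rule shipped with nvme-cli
that reacts to those driver events by starting a templated systemd
connect service; schematically it looks like this (paraphrased from
nvme-cli's 70-nvmf-autoconnect.rules, exact escaping and arguments vary
by version):

```
ACTION=="change", SUBSYSTEM=="fc", ENV{FC_EVENT}=="nvmediscovery", \
  RUN+="/usr/bin/systemctl --no-block start nvmf-connect@...service"
```

Since no comparable kernel event exists for tcp/rdma, nothing triggers
this machinery there.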
And maybe we should make several changes in blktests to make it
standalone, without interfering with the existing configuration made by
some system administrator.
??
But this is what we are trying to do with these patches.
The '--context' flag only needs to be set for the blktests, to inform
the rest of the system that these subsystems and this configuration are
special and should be exempted from 'normal' system processing.
The --context is initiator configuration. I'm referring to changes in
the target configuration.
This guarantees that things will also work in environments where
nvme-cli lacks the --context flag.
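One way to keep that guarantee is to probe for the flag before using
it; a minimal sketch (the helper name and the context string are
hypothetical, not taken from blktests):

```shell
# Probe whether the installed nvme-cli understands --context
# (helper name and context value are made up for illustration).
_nvme_cli_supports_context() {
	nvme connect --help 2>&1 | grep -q -- "--context"
}

context_arg=""
if _nvme_cli_supports_context; then
	context_arg="--context=blktests"
fi

# Pass the flag only when supported; older nvme-cli keeps working.
nvme connect --transport="${nvme_trtype}" \
    --nqn=blktests-subsystem-1 ${context_arg}
```

The probe costs one help invocation and degrades gracefully on old
installations.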
Cheers,
Hannes