Re: [PATCH blktests v3] nvme/046: test queue count changes on reconnect

On Wed, Sep 14, 2022 at 01:37:38PM +0300, Sagi Grimberg wrote:
> 
> > > > > FYI, each blktests test case can define DMESG_FILTER so that it does
> > > > > not fail on specific keywords in dmesg. Test cases meta/011 and
> > > > > block/028 are reference use cases.
> > > > 
> > > > Ah okay, let me look into it.
> > > 
> > > So I made the state read function a bit more robust (test if the state
> > > file exists) and it turns out this made rdma happy(??) but tcp is still
> > > breaking.
> > 
> > s/tcp/fc/
> > 
> > On closer inspection I see the following sequence for fc:
> > 
> > [399664.863585] nvmet: connect request for invalid subsystem blktests-subsystem-1!
> > [399664.863704] nvme nvme0: Connect Invalid Data Parameter, subsysnqn "blktests-subsystem-1"
> > [399664.863758] nvme nvme0: NVME-FC{0}: reset: Reconnect attempt failed (16770)
> > [399664.863784] nvme nvme0: NVME-FC{0}: reconnect failure
> > [399664.863837] nvme nvme0: Removing ctrl: NQN "blktests-subsystem-1"
> > 
> > When the host tries to reconnect to a non-existing controller (the test
> > called _remove_nvmet_subsystem_from_port()), the target returns 0x4182
> > (NVME_SC_DNR|NVME_SC_READ_ONLY(?)).
> 
> That is not something that the target is supposed to be doing, I have no
> idea why this is sent. Perhaps this is something specific to the fc
> implementation?

Okay, I'll look into this.
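
For reference, the status value can be split into its DNR bit and its
status code. The following is a minimal Python sketch, not kernel code.
The constant values (0x4000 for the DNR bit, 0x7ff as the status code
mask) are my reading of the Linux nvme.h header and should be treated as
assumptions, and decode_nvme_status is a hypothetical helper name. The
16770 printed in the log above is 0x4182 in hex.

```python
# Assumed values, based on my reading of include/linux/nvme.h.
NVME_SC_DNR = 0x4000   # "Do Not Retry" bit
NVME_SC_MASK = 0x7ff   # status code type + status code


def decode_nvme_status(status):
    """Split a host-side NVMe status value into the DNR flag and the code."""
    return {
        "dnr": bool(status & NVME_SC_DNR),
        "code": status & NVME_SC_MASK,
    }


# 16770 is the decimal value from the "Reconnect attempt failed" log line.
decoded = decode_nvme_status(16770)
print("dnr=%s code=%#x" % (decoded["dnr"], decoded["code"]))
```

As far as I can tell, 0x182 is shared by more than one status name
depending on the command. The "Connect Invalid Data Parameter" log line
above suggests the Connect meaning applies here rather than the
read-only one.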

> > So arguably fc behaves correctly by
> > stopping the reconnects. tcp and rdma just ignore the DNR.
> 
> DNR means do not retry the command, it says nothing about do not attempt
> a future reconnect...

That makes sense.

> > If we agree that the fc behavior is the right one, then the nvmet code
> > needs to be changed so that when the qid_max attribute changes it forces
> > a reconnect. The trick with calling _remove_nvmet_subsystem_from_port()
> > to force a reconnect is not working. And tcp/rdma needs to honor the
> > DNR.
> 
> tcp/rdma honor DNR afaik.

I misinterpreted DNR. As you pointed out, it only concerns the
command, not the reconnect attempt.

So do we agree the fc host should not stop reconnecting? James?


