Re: Lock errors in iscsi gateway

On Tuesday, 28 April 2020 18:41:27 CEST, Mike Christie wrote:

> Could you send me:
> 
> 1. The /var/log/messages for the initiator when you do IO and see those
> lock messages.

On the initiator (XenServer 7.1, which is based on CentOS AFAIK), /var/log/messages is
essentially empty.
I only (sporadically) see:
Apr 29 09:00:36 xs-n1 systemd[1]: Starting Multipath Count Service...
Apr 29 09:00:36 xs-n1 systemd[1]: Started Multipath Count Service.
Apr 29 09:00:36 xs-n1 systemd[1]: Started Session 146 of user root.
Apr 29 09:00:36 xs-n1 systemd[1]: Starting Session 146 of user root.
Apr 29 09:00:40 xs-n1 multipathd: dm-3: remove map (uevent)
Apr 29 09:00:40 xs-n1 multipathd: dm-3: devmap not registered, can't remove
Apr 29 09:00:40 xs-n1 multipathd: dm-3: remove map (uevent)
Apr 29 09:00:40 xs-n1 mpathalert: [debug|xs-n1|2 ||mscgen] mpathalert=>xapi [label="PBD.get_all_records"];
Apr 29 09:00:40 xs-n1 mpathalert: [debug|xs-n1|2 ||mscgen] mpathalert=>xapi [label="host.get_uuid"];
Apr 29 09:00:40 xs-n1 mpathalert: [debug|xs-n1|2 ||mscgen] mpathalert=>xapi [label="host.get_name_label"];
Apr 29 09:00:40 xs-n1 mpathalert: [debug|xs-n1|2 ||mscgen] mpathalert=>xapi [label="host.get_uuid"];
Apr 29 09:00:40 xs-n1 mpathalert: [debug|xs-n1|2 ||mscgen] mpathalert=>xapi [label="host.get_name_label"];
Apr 29 09:00:40 xs-n1 mpathalert: [debug|xs-n1|2 ||mscgen] mpathalert=>xapi [label="host.get_uuid"];
Apr 29 09:00:40 xs-n1 mpathalert: [debug|xs-n1|2 ||mscgen] mpathalert=>xapi [label="host.get_name_label"];
Apr 29 09:00:40 xs-n1 mpathalert: [debug|xs-n1|2 ||mscgen] mpathalert=>xapi [label="host.get_uuid"];
Apr 29 09:00:40 xs-n1 mpathalert: [debug|xs-n1|2 ||mscgen] mpathalert=>xapi [label="host.get_name_label"];
Apr 29 09:00:40 xs-n1 mpathalert: [debug|xs-n1|2 ||mscgen] mpathalert=>xapi [label="host.get_uuid"];
Apr 29 09:00:40 xs-n1 mpathalert: [debug|xs-n1|2 ||mscgen] mpathalert=>xapi [label="host.get_name_label"];
Apr 29 09:00:40 xs-n1 mpathalert: [debug|xs-n1|2 ||mscgen] mpathalert=>xapi [label="host.get_uuid"];
Apr 29 09:00:40 xs-n1 mpathalert: [debug|xs-n1|2 ||mscgen] mpathalert=>xapi [label="host.get_name_label"];
Apr 29 09:00:40 xs-n1 mpathalert: [debug|xs-n1|2 ||mscgen] mpathalert=>xapi [label="host.get_all_records"];

 
> 2. The output of
> 
> From one of the gateways:
> # gwcli ls
> 
Attached (gwcli.txt)
> From the initiator node you send the /var/log/messages for:
> # iscsiadm -m session -P 3

Attached (iscsi-session.txt)

> # multipath -ll
> 

36001405d7480e5f84b94ab19ebeebd6c dm-0 LIO-ORG ,TCMU device     
size=3.0T features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
|-+- policy='queue-length 0' prio=50 status=active
| `- 2:0:0:0 sdc 8:32 active ready running
`-+- policy='queue-length 0' prio=10 status=enabled
  `- 3:0:0:0 sdb 8:16 active ready running
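
To pick the preferred path out of this output without reading it by eye, something like the following might help. It is only a rough sketch: it assumes the `multipath -ll` layout shown above, and it assumes that the path group with the highest `prio` value is the ALUA active/optimized one. The function names `parse_path_groups` and `preferred_devices` are made up for illustration.

```python
import re

# Sample copied from the multipath -ll output above.
SAMPLE = """\
36001405d7480e5f84b94ab19ebeebd6c dm-0 LIO-ORG ,TCMU device
size=3.0T features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
|-+- policy='queue-length 0' prio=50 status=active
| `- 2:0:0:0 sdc 8:32 active ready running
`-+- policy='queue-length 0' prio=10 status=enabled
  `- 3:0:0:0 sdb 8:16 active ready running
"""

# A path-group header carries prio= and status=.
GROUP_RE = re.compile(r"prio=(\d+)\s+status=(\w+)")
# A path line carries host:channel:id:lun, the sd name and major:minor.
PATH_RE = re.compile(r"(\d+:\d+:\d+:\d+)\s+(sd[a-z]+)\s+\d+:\d+")


def parse_path_groups(text):
    """Return a list of path groups, each with its prio, status and paths."""
    groups = []
    for line in text.splitlines():
        group = GROUP_RE.search(line)
        if group:
            groups.append({"prio": int(group.group(1)),
                           "status": group.group(2),
                           "paths": []})
            continue
        path = PATH_RE.search(line)
        if path and groups:
            groups[-1]["paths"].append({"hctl": path.group(1),
                                        "dev": path.group(2)})
    return groups


def preferred_devices(groups):
    """Devices in the highest-priority group (assumed active/optimized)."""
    best = max(groups, key=lambda g: g["prio"])
    return [p["dev"] for p in best["paths"]]


if __name__ == "__main__":
    print(preferred_devices(parse_path_groups(SAMPLE)))
```

On the output above this selects sdc, the only path in the prio=50 group whose status is active.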

> 3. version info:
> 
> # uname -a

On the Initiator:
Linux xs-n1 4.4.0+2 #1 SMP Thu Jun 15 16:38:02 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

On the Target:
Linux iscsi1 4.18.0-147.8.1.el8_1.x86_64 #1 SMP Thu Apr 9 13:49:54 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

> 
> If you using rpm do:
> # rpm -q ceph-iscsi
> # rpm -q tcmu-runner
> # rpm -q python-rtslib
> 
No, I installed them from source on the target.

> To map that to an iscsi gateway then you can do the following.
> 
> If sdb is the AO one, then run
> 
> iscsiadm -m session -P 3
> 
> Here you can see the sdXYZ name to iscsi session mapping. The iscsi
> session/connection's target IP address from that command should match to
> the gateway that is listed as the "owner" of the LUN in the "gwcli ls"
> output.

I see... thanks for the hint.
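
The mapping described in the quoted hint (sdX name, to iSCSI session portal, to the gateway listed as LUN owner) could be checked with a small script along these lines. This is a sketch under assumptions: the two samples are trimmed excerpts of the attached `gwcli ls` and `iscsiadm -m session -P 3` outputs (dot leaders shortened), the regular expressions only target the line shapes seen there, only a single LUN is handled, and all function names are hypothetical.

```python
import re

# Trimmed excerpt of "iscsiadm -m session -P 3" from the initiator.
ISCSIADM_SAMPLE = """\
Target: iqn.2020-04.it.qcom.iscsi-gw:iscsi-gw (non-flash)
        Current Portal: 192.168.128.148:3260,1
                Attached scsi disk sdc          State: running
        Current Portal: 192.168.128.149:3260,2
                Attached scsi disk sdb          State: running
"""

# Trimmed excerpt of "gwcli ls" from a gateway.
GWCLI_SAMPLE = """\
      | o- rbdindex0/scsidisk0 ..... [Owner: iscsi1.ceph.interac.it, Lun: 0]
      | o- iscsi1.ceph.interac.it ..... [192.168.128.148 (UP)]
      | o- iscsi2.ceph.interac.it ..... [192.168.128.149 (UP)]
"""


def disk_to_portal(text):
    """Map each attached sd device to the portal IP of its session."""
    mapping, portal = {}, None
    for line in text.splitlines():
        m = re.search(r"Current Portal:\s*([\d.]+):\d+", line)
        if m:
            portal = m.group(1)
            continue
        m = re.search(r"Attached scsi disk (\S+)", line)
        if m and portal:
            mapping[m.group(1)] = portal
    return mapping


def gateway_ips(text):
    """Map each gateway name to the IP shown next to it in gwcli ls."""
    return dict(re.findall(r"o- (\S+) \.+ \[([\d.]+) \(\w+\)\]", text))


def lun_owner(text):
    """Return the first gateway listed as LUN owner (single-LUN assumption)."""
    m = re.search(r"\[Owner: ([^,\]]+)", text)
    return m.group(1) if m else None


def owner_devices(iscsiadm_text, gwcli_text):
    """sd devices whose session portal is the owning gateway's IP."""
    owner_ip = gateway_ips(gwcli_text).get(lun_owner(gwcli_text))
    return [dev for dev, ip in disk_to_portal(iscsiadm_text).items()
            if ip == owner_ip]


if __name__ == "__main__":
    print(owner_devices(ISCSIADM_SAMPLE, GWCLI_SAMPLE))
```

With the attached outputs, the owner is iscsi1.ceph.interac.it (192.168.128.148), and the session on that portal is the one with sdc attached.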

I ran a test: I unmapped all the drives, then mapped the first gateway (iscsi1) on all the
nodes, waited, and then mapped the second gateway, to be sure that all the nodes would see the
first gateway node as the active/master one.
Now things seem a little better in "normal" VM use: I only see the "Cannot send after
transport endpoint shutdown." message on the secondary target node.


I do see some hopping between the nodes when importing a disk drive, but at this point I'm 
starting to suspect some strange activity from the Xen infrastructure in that circumstance.

-- 
*Simone Lazzaris*
*Qcom S.p.A. a socio unico*
[root@iscsi1 ~]# gwcli ls
o- / ......................................................................................................................... [...]
  o- cluster ......................................................................................................... [Clusters: 1]
  | o- ceph ............................................................................................................ [HEALTH_OK]
  |   o- pools ......................................................................................................... [Pools: 14]
  |   | o- .rgw.root ............................................................ [(x3), Commit: 0.00Y/7845410304K (0%), Used: 768K]
  |   | o- cephfs_data ......................................................... [(x3), Commit: 0.00Y/7845410304K (0%), Used: 0.00Y]
  |   | o- cephfs_filedata ................................................. [(2+1), Commit: 0.00Y/15323067M (0%), Used: 554976896K]
  |   | o- cephfs_metadata ................................................ [(x3), Commit: 0.00Y/7845410304K (0%), Used: 713001348b]
  |   | o- default.rgw.buckets.data ...................................... [(2+1), Commit: 0.00Y/15323067M (0%), Used: 70695603904K]
  |   | o- default.rgw.buckets.index ..................................... [(x3), Commit: 0.00Y/7845410304K (0%), Used: 2777605944b]
  |   | o- default.rgw.buckets.non-ec .......................................... [(x3), Commit: 0.00Y/7845410304K (0%), Used: 0.00Y]
  |   | o- default.rgw.control ................................................. [(x3), Commit: 0.00Y/7845410304K (0%), Used: 0.00Y]
  |   | o- default.rgw.log .................................................. [(x3), Commit: 0.00Y/7845410304K (0%), Used: 1268184b]
  |   | o- default.rgw.meta ................................................ [(x3), Commit: 0.00Y/7845410304K (0%), Used: 18291455b]
  |   | o- provetta .............................................................. [(2+1), Commit: 0.00Y/15323067M (0%), Used: 192K]
  |   | o- rbd .................................................................. [(x3), Commit: 0.00Y/7845410304K (0%), Used: 192K]
  |   | o- rbdindex0 .................................................. [(x3), Commit: 3.0T/7845410304K (41%), Used: 6309981651784b]
  |   | o- rbdstore0 ...................................................... [(2+1), Commit: 0.00Y/15323067M (0%), Used: 8946965632K]
  |   o- topology ............................................................................................... [OSDs: 35,MONs: 3]
  o- disks ........................................................................................................ [3.0T, Disks: 1]
  | o- rbdindex0 ................................................................................................ [rbdindex0 (3.0T)]
  |   o- scsidisk0 .................................................................................... [rbdindex0/scsidisk0 (3.0T)]
  o- iscsi-targets ............................................................................... [DiscoveryAuth: None, Targets: 1]
    o- iqn.2020-04.it.qcom.iscsi-gw:iscsi-gw ............................................................. [Auth: None, Gateways: 2]
      o- disks .......................................................................................................... [Disks: 1]
      | o- rbdindex0/scsidisk0 ............................................................. [Owner: iscsi1.ceph.interac.it, Lun: 0]
      o- gateways ............................................................................................ [Up: 2/2, Portals: 2]
      | o- iscsi1.ceph.interac.it ........................................................................... [192.168.128.148 (UP)]
      | o- iscsi2.ceph.interac.it ........................................................................... [192.168.128.149 (UP)]
      o- host-groups .................................................................................................. [Groups : 1]
      | o- virtualfarmn ....................................................................................... [Hosts: 6, Disks: 1]
      |   o- iqn.1994-05.com.redhat:28f014226f3 ............................................................................. [host]
      |   o- iqn.1994-05.com.redhat:3b7486ce4061 ............................................................................ [host]
      |   o- iqn.1994-05.com.redhat:4fe8c962b60 ............................................................................. [host]
      |   o- iqn.1994-05.com.redhat:7518fc83636f ............................................................................ [host]
      |   o- iqn.1994-05.com.redhat:97c35ecf9625 ............................................................................ [host]
      |   o- iqn.1994-05.com.redhat:a7e8c81bcc6 ............................................................................. [host]
      |   o- rbdindex0/scsidisk0 ............................................................................................ [disk]
      o- hosts ....................................................................................... [Auth: ACL_ENABLED, Hosts: 7]
        o- iqn.1994-05.com.redhat:4fe8c962b60 .............................................. [LOGGED-IN, Auth: CHAP, Disks: 1(3.0T)]
        | o- lun 0 ...................................................... [rbdindex0/scsidisk0(3.0T), Owner: iscsi1.ceph.interac.it]
        o- iqn.1994-05.com.redhat:28f014226f3 .............................................. [LOGGED-IN, Auth: CHAP, Disks: 1(3.0T)]
        | o- lun 0 ...................................................... [rbdindex0/scsidisk0(3.0T), Owner: iscsi1.ceph.interac.it]
        o- iqn.1994-05.com.redhat:3b7486ce4061 ............................................. [LOGGED-IN, Auth: CHAP, Disks: 1(3.0T)]
        | o- lun 0 ...................................................... [rbdindex0/scsidisk0(3.0T), Owner: iscsi1.ceph.interac.it]
        o- iqn.1994-05.com.redhat:97c35ecf9625 ............................................. [LOGGED-IN, Auth: CHAP, Disks: 1(3.0T)]
        | o- lun 0 ...................................................... [rbdindex0/scsidisk0(3.0T), Owner: iscsi1.ceph.interac.it]
        o- iqn.1994-05.com.redhat:a7e8c81bcc6 .............................................. [LOGGED-IN, Auth: CHAP, Disks: 1(3.0T)]
        | o- lun 0 ...................................................... [rbdindex0/scsidisk0(3.0T), Owner: iscsi1.ceph.interac.it]
        o- iqn.1994-05.com.redhat:7518fc83636f ............................................. [LOGGED-IN, Auth: CHAP, Disks: 1(3.0T)]
        | o- lun 0 ...................................................... [rbdindex0/scsidisk0(3.0T), Owner: iscsi1.ceph.interac.it]
        o- iqn.1993-08.org.debian:01:77aee8e76d54 .................................................... [Auth: CHAP, Disks: 0(0.00Y)]
[root@iscsi1 ~]# 
[root@xs-n1 log]# iscsiadm -m session -P 3
iSCSI Transport Class version 2.0-870
version 6.2.0.873-30
Target: iqn.2020-04.it.qcom.iscsi-gw:iscsi-gw (non-flash)
        Current Portal: 192.168.128.148:3260,1
        Persistent Portal: 192.168.128.148:3260,1
                **********
                Interface:
                **********
                Iface Name: default
                Iface Transport: tcp
                Iface Initiatorname: iqn.1994-05.com.redhat:4fe8c962b60
                Iface IPaddress: 192.168.130.170
                Iface HWaddress: <empty>
                Iface Netdev: <empty>
                SID: 2
                iSCSI Connection State: LOGGED IN
                iSCSI Session State: LOGGED_IN
                Internal iscsid Session State: NO CHANGE
                *********
                Timeouts:
                *********
                Recovery Timeout: 25
                Target Reset Timeout: 30
                LUN Reset Timeout: 30
                Abort Timeout: 15
                *****
                CHAP:
                *****
                username: virtualfarmn
                password: ********
                username_in: <empty>
                password_in: ********
                ************************
                Negotiated iSCSI params:
                ************************
                HeaderDigest: None
                DataDigest: None
                MaxRecvDataSegmentLength: 262144
                MaxXmitDataSegmentLength: 262144
                FirstBurstLength: 262144
                MaxBurstLength: 524288
                ImmediateData: Yes
                InitialR2T: Yes
                MaxOutstandingR2T: 1
                ************************
                Attached SCSI devices:
                ************************
                Host Number: 2  State: running
                scsi2 Channel 00 Id 0 Lun: 0
                        Attached scsi disk sdc          State: running
        Current Portal: 192.168.128.149:3260,2
        Persistent Portal: 192.168.128.149:3260,2
                **********
                Interface:
                **********
                Iface Name: default
                Iface Transport: tcp
                Iface Initiatorname: iqn.1994-05.com.redhat:4fe8c962b60
                Iface IPaddress: 192.168.130.170
                Iface HWaddress: <empty>
                Iface Netdev: <empty>
                SID: 3
                iSCSI Connection State: LOGGED IN
                iSCSI Session State: LOGGED_IN
                Internal iscsid Session State: NO CHANGE
                *********
                Timeouts:
                *********
                Recovery Timeout: 25
                Target Reset Timeout: 30
                LUN Reset Timeout: 30
                Abort Timeout: 15
                *****
                CHAP:
                *****
                username: virtualfarmn
                password: ********
                username_in: <empty>
                password_in: ********
                ************************
                Negotiated iSCSI params:
                ************************
                HeaderDigest: None
                DataDigest: None
                MaxRecvDataSegmentLength: 262144
                MaxXmitDataSegmentLength: 262144
                FirstBurstLength: 262144
                MaxBurstLength: 524288
                ImmediateData: Yes
                InitialR2T: Yes
                MaxOutstandingR2T: 1
                ************************
                Attached SCSI devices:
                ************************
                Host Number: 3  State: running
                scsi3 Channel 00 Id 0 Lun: 0
                        Attached scsi disk sdb          State: running
[root@xs-n1 log]# 