On 10/10/2018 08:21 AM, Steven Vacaroaia wrote:
> Hi Jason,
> Thanks for your prompt responses
>
> I have used same iscsi-gateway.cfg file - no security changes - just added prometheus entry
> There is no iscsi-gateway.conf but the gateway.conf object is created and has correct entries
>
> iscsi-gateway.cfg is identical and contains the following
>
> [config]
> cluster_name = ceph
> gateway_keyring = ceph.client.admin.keyring
> api_secure = false
> trusted_ip_list = 10.10.30.181,10.10.30.182,10.10.30.183,10.10.30.184,10.10.30.185,10.10.30.186
> prometheus_host = 0.0.0.0
>
> I am running the disks commands from OSD01 and they fail with the following
>
> INFO [gateway.py:344:load_config()] - (Gateway.load_config) successfully loaded existing target definition
> 2018-10-10 09:04:48,956 DEBUG [gateway.py:423:map_luns()] - processing tpg2
> 2018-10-10 09:04:48,956 DEBUG [gateway.py:428:map_luns()] - rbd.dstest needed mapping to tpg2
> 2018-10-10 09:04:48,958 INFO [gateway.py:403:bind_alua_group_to_lun()] - Setup group ao for rbd.dstest on tpg 2 (state 0, owner True, failover type 1)
> 2018-10-10 09:04:48,958 DEBUG [gateway.py:405:bind_alua_group_to_lun()] - Setting Luns tg_pt_gp to ao
> 2018-10-10 09:04:48,959 DEBUG [gateway.py:409:bind_alua_group_to_lun()] - Bound rbd.dstest on tpg2 to ao
> 2018-10-10 09:04:48,959 DEBUG [gateway.py:423:map_luns()] - processing tpg1
> 2018-10-10 09:04:48,959 DEBUG [gateway.py:428:map_luns()] - rbd.dstest needed mapping to tpg1
> 2018-10-10 09:04:48,960 INFO [gateway.py:403:bind_alua_group_to_lun()] - Setup group ano1 for rbd.dstest on tpg 1 (state 1, owner False, failover type 1)
> 2018-10-10 09:04:48,960 DEBUG [gateway.py:405:bind_alua_group_to_lun()] - Setting Luns tg_pt_gp to ano1
> 2018-10-10 09:04:48,961 DEBUG [gateway.py:409:bind_alua_group_to_lun()] - Bound rbd.dstest on tpg1 to ano1
> 2018-10-10 09:04:48,963 INFO [_internal.py:87:_log()] - 127.0.0.1 - - [10/Oct/2018 09:04:48] "PUT /api/_disk/rbd.dstest HTTP/1.1" 200 -
> 2018-10-10 09:04:48,965 INFO [rbd-target-api:1804:call_api()] - _disk update on 127.0.0.1, successful
> 2018-10-10 09:04:48,965 DEBUG [rbd-target-api:1789:call_api()] - processing GW 'osd03'
> 2018-10-10 09:04:49,039 ERROR [rbd-target-api:1810:call_api()] - _disk change on osd03 failed with 500
> 2018-10-10 09:04:49,041 INFO [_internal.py:87:_log()] - 127.0.0.1 - - [10/Oct/2018 09:04:49] "PUT /api/disk/rbd.dstest HTTP/1.1" 500 -
>
> on OSD03 there is the following "error"
>
> INFO [lun.py:656:add_dev_to_lio()] - (LUN.add_dev_to_lio) Adding image 'rbd.dstest' to LIO
> 2018-10-10 09:04:49,037 DEBUG [lun.py:666:add_dev_to_lio()] - control="max_data_area_mb=8"
>
> Amazingly enough, gwcli on OSD03 shows the disk created but on OSD01 it does not
> If I restart gwcli on OSD01, the disk is there but it cannot be added to the host because the image does not exist ???

What is the output of

systemctl status rbd-target-api
systemctl status rbd-target-gw

Is api in a failed state or does it indicate it has been crashing and restarting?

Does /var/log/messages show that rbd-target-api is crashing and restarting and could you attach the stack trace? The /var/log/rbd-target-api log will show

Does gwcli ls show it cannot reach the remote gateways?

>
> adding the disk to the hosts failed with "client masking update" error
>
> disk add rbd.dstest
> CMD: ../hosts/<client_iqn> disk action=add disk=rbd.dstest
> Client 'iqn.1998-01.com.vmware:test-2d06960a' update - add disk rbd.dstest
> disk add for 'rbd.dstest' against iqn.1998-01.com.vmware:test-2d06960a failed
> client masking update failed on osd03. Client update failed
>
> rbd-target-api:1216:_update_client()] - client update failed on iqn.1998-01.com.vmware:test-2d06960a : Non-existent images ['rbd.dstest'] requested for iqn.1998-01.com.vmware:test-2d06960a
>
> However, the image is listed on gwcli and using rados ls
>
> /disks> ls
> o- disks .......................................................................................................... [150G, Disks: 1]
>   o- rbd.dstest .................................................................................................... [dstest (150G)]
>
> rados -p rbd ls | grep dstest
> rbd_id.dstest
>
> I would really appreciate any help / suggestions
>
> Thanks
> Steven
>
> On Tue, 9 Oct 2018 at 16:35, Jason Dillaman <jdillama@xxxxxxxxxx> wrote:
>
> Anything in the rbd-target-api.log on osd03 to indicate why it failed?
>
> Since you replaced your existing "iscsi-gateway.conf", do your security settings still match between the two hosts (i.e. on the trusted_ip_list, same api_XYZ options)?
> On Tue, Oct 9, 2018 at 4:25 PM Steven Vacaroaia <stef97@xxxxxxxxx> wrote:
> >
> > so the gateways are up but I have issues adding disks (i.e. if I do it on one gateway it does not show on the other - however, after I restart the rbd-target services I am seeing the disks)
> > Thanks in advance for taking the trouble to provide advice / guidance
> >
> > 2018-10-09 16:16:08,968 INFO [rbd-target-api:1804:call_api()] - _clientlun update on 127.0.0.1, successful
> > 2018-10-09 16:16:08,968 DEBUG [rbd-target-api:1789:call_api()] - processing GW 'osd03'
> > 2018-10-09 16:16:08,987 ERROR [rbd-target-api:1810:call_api()] - _clientlun change on osd03 failed with 500
> > 2018-10-09 16:16:08,987 DEBUG [rbd-target-api:1827:call_api()] - failed on osd03, applied to 127.0.0.1, aborted osd03. Client update failed
> > 2018-10-09 16:16:08,987 INFO [_internal.py:87:_log()] - 127.0.0.1 - - [09/Oct/2018 16:16:08] "PUT /api/clientlun/iqn.1998-01.com.vmware:test-2d06960a HTTP/1.1" 500 -
> >
> > On Tue, 9 Oct 2018 at 15:42, Steven Vacaroaia <stef97@xxxxxxxxx> wrote:
> >>
> >> It worked.
> >>
> >> many thanks
> >> Steven
> >>
> >> On Tue, 9 Oct 2018 at 15:36, Jason Dillaman <jdillama@xxxxxxxxxx> wrote:
> >>>
> >>> Can you try applying [1] and see if that resolves your issue?
> >>>
> >>> [1] https://github.com/ceph/ceph-iscsi-config/pull/78
> >>> On Tue, Oct 9, 2018 at 3:06 PM Steven Vacaroaia <stef97@xxxxxxxxx> wrote:
> >>> >
> >>> > Thanks Jason
> >>> >
> >>> > adding prometheus_host = 0.0.0.0 to iscsi-gateway.cfg does not work - the error message is
> >>> >
> >>> > "..rbd-target-gw: ValueError: invalid literal for int() with base 10: '0.0.0.0' "
> >>> >
> >>> > adding prometheus_exporter = false works
> >>> >
> >>> > However I'd like to use prometheus_exporter if possible
> >>> > Any suggestions will be appreciated
> >>> >
> >>> > Steven
> >>> >
> >>> > On Tue, 9 Oct 2018 at 14:25, Jason Dillaman <jdillama@xxxxxxxxxx> wrote:
> >>> >>
> >>> >> You can try adding "prometheus_exporter = false" in your "/etc/ceph/iscsi-gateway.cfg"'s "config" section if you aren't using "cephmetrics", or try setting "prometheus_host = 0.0.0.0" since it sounds like you have the IPv6 stack disabled.
> >>> >>
> >>> >> [1] https://github.com/ceph/ceph-iscsi-config/blob/master/ceph_iscsi_config/settings.py#L90
> >>> >> On Tue, Oct 9, 2018 at 2:09 PM Steven Vacaroaia <stef97@xxxxxxxxx> wrote:
> >>> >> >
> >>> >> > here is some info from /var/log/messages ..in case someone has the time to take a look
> >>> >> >
> >>> >> > Oct 9 13:58:35 osd03 systemd: Started Setup system to export rbd images through LIO.
> >>> >> > Oct 9 13:58:35 osd03 systemd: Starting Setup system to export rbd images through LIO...
> >>> >> > Oct 9 13:58:35 osd03 journal: Processing osd blacklist entries for this node
> >>> >> > Oct 9 13:58:35 osd03 journal: No OSD blacklist entries found
> >>> >> > Oct 9 13:58:35 osd03 journal: Reading the configuration object to update local LIO configuration
> >>> >> > Oct 9 13:58:35 osd03 journal: Configuration does not have an entry for this host(osd03) - nothing to define to LIO
> >>> >> > Oct 9 13:58:35 osd03 journal: Integrated Prometheus exporter is enabled
> >>> >> > Oct 9 13:58:35 osd03 journal: * Running on http://[::]:9287/
> >>> >> > Oct 9 13:58:35 osd03 journal: Removing iSCSI target from LIO
> >>> >> > Oct 9 13:58:35 osd03 journal: Removing LUNs from LIO
> >>> >> > Oct 9 13:58:35 osd03 journal: Active Ceph iSCSI gateway configuration removed
> >>> >> > Oct 9 13:58:35 osd03 rbd-target-gw: Traceback (most recent call last):
> >>> >> > Oct 9 13:58:35 osd03 rbd-target-gw:   File "/usr/bin/rbd-target-gw", line 5, in <module>
> >>> >> > Oct 9 13:58:35 osd03 rbd-target-gw:     pkg_resources.run_script('ceph-iscsi-config==2.6', 'rbd-target-gw')
> >>> >> > Oct 9 13:58:35 osd03 rbd-target-gw:   File "/usr/lib/python2.7/site-packages/pkg_resources.py", line 540, in run_script
> >>> >> > Oct 9 13:58:35 osd03 rbd-target-gw:     self.require(requires)[0].run_script(script_name, ns)
> >>> >> > Oct 9 13:58:35 osd03 rbd-target-gw:   File "/usr/lib/python2.7/site-packages/pkg_resources.py", line 1462, in run_script
> >>> >> > Oct 9 13:58:35 osd03 rbd-target-gw:     exec_(script_code, namespace, namespace)
> >>> >> > Oct 9 13:58:35 osd03 rbd-target-gw:   File "/usr/lib/python2.7/site-packages/pkg_resources.py", line 41, in exec_
> >>> >> > Oct 9 13:58:35 osd03 rbd-target-gw:     exec("""exec code in globs, locs""")
> >>> >> > Oct 9 13:58:35 osd03 rbd-target-gw:   File "<string>", line 1, in <module>
> >>> >> > Oct 9 13:58:35 osd03 rbd-target-gw:   File "/usr/lib/python2.7/site-packages/ceph_iscsi_config-2.6-py2.7.egg/EGG-INFO/scripts/rbd-target-gw", line 432, in <module>
> >>> >> > Oct 9 13:58:35 osd03 rbd-target-gw:   File "/usr/lib/python2.7/site-packages/ceph_iscsi_config-2.6-py2.7.egg/EGG-INFO/scripts/rbd-target-gw", line 379, in main
> >>> >> > Oct 9 13:58:35 osd03 rbd-target-gw:   File "/usr/lib/python2.7/site-packages/flask/app.py", line 772, in run
> >>> >> > Oct 9 13:58:35 osd03 rbd-target-gw:     run_simple(host, port, self, **options)
> >>> >> > Oct 9 13:58:35 osd03 rbd-target-gw:   File "/usr/lib/python2.7/site-packages/werkzeug/serving.py", line 710, in run_simple
> >>> >> > Oct 9 13:58:35 osd03 rbd-target-gw:     inner()
> >>> >> > Oct 9 13:58:35 osd03 rbd-target-gw:   File "/usr/lib/python2.7/site-packages/werkzeug/serving.py", line 692, in inner
> >>> >> > Oct 9 13:58:35 osd03 rbd-target-gw:     passthrough_errors, ssl_context).serve_forever()
> >>> >> > Oct 9 13:58:35 osd03 rbd-target-gw:   File "/usr/lib/python2.7/site-packages/werkzeug/serving.py", line 480, in make_server
> >>> >> > Oct 9 13:58:35 osd03 rbd-target-gw:     passthrough_errors, ssl_context)
> >>> >> > Oct 9 13:58:35 osd03 rbd-target-gw:   File "/usr/lib/python2.7/site-packages/werkzeug/serving.py", line 410, in __init__
> >>> >> > Oct 9 13:58:35 osd03 rbd-target-gw:     HTTPServer.__init__(self, (host, int(port)), handler)
> >>> >> > Oct 9 13:58:35 osd03 rbd-target-gw:   File "/usr/lib64/python2.7/SocketServer.py", line 417, in __init__
> >>> >> > Oct 9 13:58:35 osd03 rbd-target-gw:     self.socket_type)
> >>> >> > Oct 9 13:58:35 osd03 rbd-target-gw:   File "/usr/lib64/python2.7/socket.py", line 187, in __init__
> >>> >> > Oct 9 13:58:35 osd03 rbd-target-gw:     _sock = _realsocket(family, type, proto)
> >>> >> > Oct 9 13:58:35 osd03 rbd-target-gw: socket.error: [Errno 97] Address family not supported by protocol
> >>> >> > Oct 9 13:58:35 osd03 systemd: rbd-target-gw.service: main process exited, code=exited, status=1/FAILURE
> >>> >> >
> >>> >> > On Tue, 9 Oct 2018 at 13:16, Steven Vacaroaia <stef97@xxxxxxxxx> wrote:
> >>> >> >>
> >>> >> >> Hi ,
> >>> >> >> I am using Mimic 13.2 and kernel 4.18
> >>> >> >> Was using gwcli 2.5 and decided to upgrade to latest (2.7) as people reported improved performance
> >>> >> >>
> >>> >> >> What is the proper methodology ?
> >>> >> >> How should I troubleshoot this?
> >>> >> >>
> >>> >> >> What I did ( and it broke it) was
> >>> >> >>
> >>> >> >> cd tcmu-runner; git pull ; make && make install
> >>> >> >> cd ceph-iscsi-cli; git pull;python setup.py install
> >>> >> >> cd ceph-iscsi-config;git pull; python setup.py install
> >>> >> >> cd rtslib-fb;git pull; python setup.py install
> >>> >> >>
> >>> >> >> After a reboot, I cannot start rbd-target-gw and the logs are not very helpful
> >>> >> >> ( Note:
> >>> >> >> I removed /etc/ceph/iscsi-gateway.cfg and gateway.conf object as I wanted to start fresh
> >>> >> >> /etc/ceph/iscsi-gatway.conf was left unchanged )
> >>> >> >>
> >>> >> >> 2018-10-09 12:47:50,593 [ INFO] - Processing osd blacklist entries for this node
> >>> >> >> 2018-10-09 12:47:50,893 [ INFO] - No OSD blacklist entries found
> >>> >> >> 2018-10-09 12:47:50,893 [ INFO] - Reading the configuration object to update local LIO configuration
> >>> >> >> 2018-10-09 12:47:50,893 [ INFO] - Configuration does not have an entry for this host(osd03) - nothing to define to LIO
> >>> >> >> 2018-10-09 12:47:50,893 [ INFO] - Integrated Prometheus exporter is enabled
> >>> >> >> 2018-10-09 12:47:50,895 [ INFO] - * Running on http://[::]:9287/
> >>> >> >> 2018-10-09 12:47:50,896 [ INFO] - Removing iSCSI target from LIO
> >>> >> >> 2018-10-09 12:47:50,896 [ INFO] - Removing LUNs from LIO
> >>> >> >> 2018-10-09 12:47:50,896 [ INFO] - Active Ceph iSCSI gateway configuration removed
> >>> >> >>
> >>> >> >> Many thanks
> >>> >> >> Steven
> >>> >> >>
> >>> >> > _______________________________________________
> >>> >> > ceph-users mailing list
> >>> >> > ceph-users@xxxxxxxxxxxxxx
> >>> >> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >>> >>
> >>> >> --
> >>> >> Jason
> >>>
> >>> --
> >>> Jason
>
> --
> Jason
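
A note on the "socket.error: [Errno 97] Address family not supported by protocol" traceback quoted above: it is consistent with the exporter trying to listen on "::" (see "Running on http://[::]:9287/") on a host where the IPv6 stack is disabled, which is what the "prometheus_host = 0.0.0.0" suggestion works around. The sketch below is only an illustration of that idea and is not code from ceph-iscsi-config; "pick_bind_host" is a hypothetical helper name. It probes whether an IPv6 socket can be created and bound, and otherwise falls back to an IPv4 address.

```python
import socket


def pick_bind_host(preferred="::", fallback="0.0.0.0"):
    """Return `preferred` if an IPv6 socket can be created and bound to it,
    otherwise return `fallback`. Hypothetical helper, for illustration only."""
    if not socket.has_ipv6:
        # Python itself was built without IPv6 support.
        return fallback
    try:
        sock = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)
    except socket.error:
        # e.g. [Errno 97] Address family not supported by protocol
        return fallback
    try:
        # Port 0 asks the kernel for any free port; we only test the address.
        sock.bind((preferred, 0))
    except socket.error:
        return fallback
    finally:
        sock.close()
    return preferred


if __name__ == "__main__":
    print("would bind to: %s" % pick_bind_host())
```

On a host with IPv6 disabled this should select the fallback address, which is the same effect as setting "prometheus_host = 0.0.0.0" by hand once the gateway accepts that option.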