Re: ceph-iscsi upgrade issue


 



Hi Jason,
Thanks for your prompt responses 

I have used the same iscsi-gateway.cfg file, with no security changes; I only added the prometheus entry.
There is no iscsi-gateway.conf, but the gateway.conf object has been created and has the correct entries.

iscsi-gateway.cfg is identical on the gateways and contains the following:

[config]
cluster_name = ceph
gateway_keyring = ceph.client.admin.keyring
api_secure = false
trusted_ip_list = 10.10.30.181,10.10.30.182,10.10.30.183,10.10.30.184,10.10.30.185,10.10.30.186
prometheus_host = 0.0.0.0
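
To double-check that the settings really do match between two gateways, something like the following could be run against the contents of each host's iscsi-gateway.cfg. It is only a minimal sketch: the helper names and the list of keys to compare are assumptions made for illustration and are not part of ceph-iscsi.

```python
import configparser

# Keys the gateways are expected to agree on. This list is an assumption
# for illustration, based on the settings discussed in this thread.
KEYS_TO_COMPARE = ("cluster_name", "gateway_keyring", "api_secure", "trusted_ip_list")


def load_gateway_config(text):
    """Parse the [config] section of an iscsi-gateway.cfg into a dict."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return dict(parser["config"])


def normalize(key, value):
    """Make values comparable: IP lists become sets, other values are stripped."""
    if value is None:
        return None
    if key == "trusted_ip_list":
        return frozenset(ip.strip() for ip in value.split(",") if ip.strip())
    return value.strip()


def config_differences(local_text, remote_text, keys=KEYS_TO_COMPARE):
    """Return {key: (local_value, remote_value)} for every key that differs."""
    local = load_gateway_config(local_text)
    remote = load_gateway_config(remote_text)
    diffs = {}
    for key in keys:
        if normalize(key, local.get(key)) != normalize(key, remote.get(key)):
            diffs[key] = (local.get(key), remote.get(key))
    return diffs
```

Passing in the text of /etc/ceph/iscsi-gateway.cfg from each gateway returns an empty dict when the compared keys agree. The order of entries in trusted_ip_list is ignored on purpose.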



I am running the disk commands from OSD01, and they fail with the following:

INFO [gateway.py:344:load_config()] - (Gateway.load_config) successfully loaded existing target definition
2018-10-10 09:04:48,956    DEBUG [gateway.py:423:map_luns()] - processing tpg2
2018-10-10 09:04:48,956    DEBUG [gateway.py:428:map_luns()] - rbd.dstest needed mapping to tpg2
2018-10-10 09:04:48,958     INFO [gateway.py:403:bind_alua_group_to_lun()] - Setup group ao for rbd.dstest on tpg 2 (state 0, owner True, failover type 1)
2018-10-10 09:04:48,958    DEBUG [gateway.py:405:bind_alua_group_to_lun()] - Setting Luns tg_pt_gp to ao
2018-10-10 09:04:48,959    DEBUG [gateway.py:409:bind_alua_group_to_lun()] - Bound rbd.dstest on tpg2 to ao
2018-10-10 09:04:48,959    DEBUG [gateway.py:423:map_luns()] - processing tpg1
2018-10-10 09:04:48,959    DEBUG [gateway.py:428:map_luns()] - rbd.dstest needed mapping to tpg1
2018-10-10 09:04:48,960     INFO [gateway.py:403:bind_alua_group_to_lun()] - Setup group ano1 for rbd.dstest on tpg 1 (state 1, owner False, failover type 1)
2018-10-10 09:04:48,960    DEBUG [gateway.py:405:bind_alua_group_to_lun()] - Setting Luns tg_pt_gp to ano1
2018-10-10 09:04:48,961    DEBUG [gateway.py:409:bind_alua_group_to_lun()] - Bound rbd.dstest on tpg1 to ano1
2018-10-10 09:04:48,963     INFO [_internal.py:87:_log()] - 127.0.0.1 - - [10/Oct/2018 09:04:48] "PUT /api/_disk/rbd.dstest HTTP/1.1" 200 -
2018-10-10 09:04:48,965     INFO [rbd-target-api:1804:call_api()] - _disk update on 127.0.0.1, successful
2018-10-10 09:04:48,965    DEBUG [rbd-target-api:1789:call_api()] - processing GW 'osd03'
2018-10-10 09:04:49,039    ERROR [rbd-target-api:1810:call_api()] - _disk change on osd03 failed with 500
2018-10-10 09:04:49,041     INFO [_internal.py:87:_log()] - 127.0.0.1 - - [10/Oct/2018 09:04:49] "PUT /api/disk/rbd.dstest HTTP/1.1" 500 -
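
For longer logs, the failing gateway calls can be pulled out with a small script. This is a rough sketch that assumes the rbd-target-api.log line layout shown above (timestamp, level, [source:line:function], message); the function name and regular expressions are illustrative and not part of ceph-iscsi.

```python
import re

# Matches lines such as:
# 2018-10-10 09:04:49,039    ERROR [rbd-target-api:1810:call_api()] - _disk change on osd03 failed with 500
LINE_RE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+"
    r"\[(?P<source>[^:\]]+):(?P<lineno>\d+):(?P<func>[^\]]+)\]\s+-\s+"
    r"(?P<message>.*)$"
)

# Matches the "<endpoint> change on <gateway> failed with <status>" message.
FAIL_RE = re.compile(
    r"(?P<endpoint>\S+) change on (?P<gateway>\S+) failed with (?P<status>\d+)"
)


def failed_gateway_calls(lines):
    """Return one dict per ERROR line that reports a failed call to a gateway."""
    failures = []
    for line in lines:
        parsed = LINE_RE.match(line.strip())
        if not parsed or parsed.group("level") != "ERROR":
            continue
        failed = FAIL_RE.search(parsed.group("message"))
        if failed:
            failures.append({
                "timestamp": parsed.group("timestamp"),
                "endpoint": failed.group("endpoint"),
                "gateway": failed.group("gateway"),
                "status": int(failed.group("status")),
            })
    return failures
```

Feeding it the lines of the log file gives the timestamp, endpoint, gateway and HTTP status of each failure, which makes it easier to find the matching entries in the log on the remote gateway.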


On OSD03 there is the following "error":

 INFO [lun.py:656:add_dev_to_lio()] - (LUN.add_dev_to_lio) Adding image 'rbd.dstest' to LIO
2018-10-10 09:04:49,037    DEBUG [lun.py:666:add_dev_to_lio()] - control="max_data_area_mb=8"

Amazingly enough, gwcli on OSD03 shows the disk as created, but on OSD01 it does not.
If I restart gwcli on OSD01, the disk is there, but it cannot be added to the host because the image supposedly does not exist.


Adding the disk to the host failed with a "client masking update" error:

disk add rbd.dstest
CMD: ../hosts/<client_iqn> disk action="" disk=rbd.dstest
Client 'iqn.1998-01.com.vmware:test-2d06960a' update - add disk rbd.dstest
disk add for 'rbd.dstest' against iqn.1998-01.com.vmware:test-2d06960a failed
client masking update failed on osd03. Client update failed

rbd-target-api:1216:_update_client()] - client update failed on iqn.1998-01.com.vmware:test-2d06960a : Non-existent images ['rbd.dstest'] requested for iqn.1998-01.com.vmware:test-2d06960a

However, the image is listed in gwcli and by rados ls:

/disks> ls
o- disks .......................................................................................................... [150G, Disks: 1]
  o- rbd.dstest .................................................................................................... [dstest (150G)]

rados -p rbd ls | grep dstest
rbd_id.dstest



I would really appreciate any help / suggestions

Thanks
Steven 

On Tue, 9 Oct 2018 at 16:35, Jason Dillaman <jdillama@xxxxxxxxxx> wrote:
Anything in the rbd-target-api.log on osd03 to indicate why it failed?

Since you replaced your existing "iscsi-gateway.conf", do your
security settings still match between the two hosts (i.e. on the
trusted_ip_list, same api_XYZ options)?
On Tue, Oct 9, 2018 at 4:25 PM Steven Vacaroaia <stef97@xxxxxxxxx> wrote:
>
> So the gateways are up, but I have issues adding disks (i.e. if I do it on one gateway, it does not show on the other; however, after I restart the rbd-target services, I see the disks).
> Thanks in advance for taking the trouble to provide advice / guidance
>
> 2018-10-09 16:16:08,968     INFO [rbd-target-api:1804:call_api()] - _clientlun update on 127.0.0.1, successful
> 2018-10-09 16:16:08,968    DEBUG [rbd-target-api:1789:call_api()] - processing GW 'osd03'
> 2018-10-09 16:16:08,987    ERROR [rbd-target-api:1810:call_api()] - _clientlun change on osd03 failed with 500
> 2018-10-09 16:16:08,987    DEBUG [rbd-target-api:1827:call_api()] - failed on osd03, applied to 127.0.0.1, aborted osd03. Client update failed
> 2018-10-09 16:16:08,987     INFO [_internal.py:87:_log()] - 127.0.0.1 - - [09/Oct/2018 16:16:08] "PUT /api/clientlun/iqn.1998-01.com.vmware:test-2d06960a HTTP/1.1" 500 -
>
> On Tue, 9 Oct 2018 at 15:42, Steven Vacaroaia <stef97@xxxxxxxxx> wrote:
>>
>> It worked.
>>
>> many thanks
>> Steven
>>
>> On Tue, 9 Oct 2018 at 15:36, Jason Dillaman <jdillama@xxxxxxxxxx> wrote:
>>>
>>> Can you try applying [1] and see if that resolves your issue?
>>>
>>> [1] https://github.com/ceph/ceph-iscsi-config/pull/78
>>> On Tue, Oct 9, 2018 at 3:06 PM Steven Vacaroaia <stef97@xxxxxxxxx> wrote:
>>> >
>>> > Thanks Jason
>>> >
>>> > adding prometheus_host = 0.0.0.0 to iscsi-gateway.cfg does not work - the error message is
>>> >
>>> > "..rbd-target-gw: ValueError: invalid literal for int() with base 10: '0.0.0.0' "
>>> >
>>> > adding prometheus_exporter = false works
>>> >
>>> > However I'd like to use prometheus_exporter if possible
>>> > Any suggestions will be appreciated
>>> >
>>> > Steven
>>> >
>>> >
>>> >
>>> > On Tue, 9 Oct 2018 at 14:25, Jason Dillaman <jdillama@xxxxxxxxxx> wrote:
>>> >>
>>> >> You can try adding "prometheus_exporter = false" in your
>>> >> "/etc/ceph/iscsi-gateway.cfg"'s "config" section if you aren't using
>>> >> "cephmetrics", or try setting "prometheus_host = 0.0.0.0" since it
>>> >> sounds like you have the IPv6 stack disabled.
>>> >>
>>> >> [1] https://github.com/ceph/ceph-iscsi-config/blob/master/ceph_iscsi_config/settings.py#L90
>>> >> On Tue, Oct 9, 2018 at 2:09 PM Steven Vacaroaia <stef97@xxxxxxxxx> wrote:
>>> >> >
>>> >> > here is some info from /var/log/messages ..in case someone has the time to take a look
>>> >> >
>>> >> > Oct  9 13:58:35 osd03 systemd: Started Setup system to export rbd images through LIO.
>>> >> > Oct  9 13:58:35 osd03 systemd: Starting Setup system to export rbd images through LIO...
>>> >> > Oct  9 13:58:35 osd03 journal: Processing osd blacklist entries for this node
>>> >> > Oct  9 13:58:35 osd03 journal: No OSD blacklist entries found
>>> >> > Oct  9 13:58:35 osd03 journal: Reading the configuration object to update local LIO configuration
>>> >> > Oct  9 13:58:35 osd03 journal: Configuration does not have an entry for this host(osd03) - nothing to define to LIO
>>> >> > Oct  9 13:58:35 osd03 journal: Integrated Prometheus exporter is enabled
>>> >> > Oct  9 13:58:35 osd03 journal: * Running on http://[::]:9287/
>>> >> > Oct  9 13:58:35 osd03 journal: Removing iSCSI target from LIO
>>> >> > Oct  9 13:58:35 osd03 journal: Removing LUNs from LIO
>>> >> > Oct  9 13:58:35 osd03 journal: Active Ceph iSCSI gateway configuration removed
>>> >> > Oct  9 13:58:35 osd03 rbd-target-gw: Traceback (most recent call last):
>>> >> > Oct  9 13:58:35 osd03 rbd-target-gw: File "/usr/bin/rbd-target-gw", line 5, in <module>
>>> >> > Oct  9 13:58:35 osd03 rbd-target-gw: pkg_resources.run_script('ceph-iscsi-config==2.6', 'rbd-target-gw')
>>> >> > Oct  9 13:58:35 osd03 rbd-target-gw: File "/usr/lib/python2.7/site-packages/pkg_resources.py", line 540, in run_script
>>> >> > Oct  9 13:58:35 osd03 rbd-target-gw: self.require(requires)[0].run_script(script_name, ns)
>>> >> > Oct  9 13:58:35 osd03 rbd-target-gw: File "/usr/lib/python2.7/site-packages/pkg_resources.py", line 1462, in run_script
>>> >> > Oct  9 13:58:35 osd03 rbd-target-gw: exec_(script_code, namespace, namespace)
>>> >> > Oct  9 13:58:35 osd03 rbd-target-gw: File "/usr/lib/python2.7/site-packages/pkg_resources.py", line 41, in exec_
>>> >> > Oct  9 13:58:35 osd03 rbd-target-gw: exec("""exec code in globs, locs""")
>>> >> > Oct  9 13:58:35 osd03 rbd-target-gw: File "<string>", line 1, in <module>
>>> >> > Oct  9 13:58:35 osd03 rbd-target-gw: File "/usr/lib/python2.7/site-packages/ceph_iscsi_config-2.6-py2.7.egg/EGG-INFO/scripts/rbd-target-gw", line 432, in <module>
>>> >> > Oct  9 13:58:35 osd03 rbd-target-gw: File "/usr/lib/python2.7/site-packages/ceph_iscsi_config-2.6-py2.7.egg/EGG-INFO/scripts/rbd-target-gw", line 379, in main
>>> >> > Oct  9 13:58:35 osd03 rbd-target-gw: File "/usr/lib/python2.7/site-packages/flask/app.py", line 772, in run
>>> >> > Oct  9 13:58:35 osd03 rbd-target-gw: run_simple(host, port, self, **options)
>>> >> > Oct  9 13:58:35 osd03 rbd-target-gw: File "/usr/lib/python2.7/site-packages/werkzeug/serving.py", line 710, in run_simple
>>> >> > Oct  9 13:58:35 osd03 rbd-target-gw: inner()
>>> >> > Oct  9 13:58:35 osd03 rbd-target-gw: File "/usr/lib/python2.7/site-packages/werkzeug/serving.py", line 692, in inner
>>> >> > Oct  9 13:58:35 osd03 rbd-target-gw: passthrough_errors, ssl_context).serve_forever()
>>> >> > Oct  9 13:58:35 osd03 rbd-target-gw: File "/usr/lib/python2.7/site-packages/werkzeug/serving.py", line 480, in make_server
>>> >> > Oct  9 13:58:35 osd03 rbd-target-gw: passthrough_errors, ssl_context)
>>> >> > Oct  9 13:58:35 osd03 rbd-target-gw: File "/usr/lib/python2.7/site-packages/werkzeug/serving.py", line 410, in __init__
>>> >> > Oct  9 13:58:35 osd03 rbd-target-gw: HTTPServer.__init__(self, (host, int(port)), handler)
>>> >> > Oct  9 13:58:35 osd03 rbd-target-gw: File "/usr/lib64/python2.7/SocketServer.py", line 417, in __init__
>>> >> > Oct  9 13:58:35 osd03 rbd-target-gw: self.socket_type)
>>> >> > Oct  9 13:58:35 osd03 rbd-target-gw: File "/usr/lib64/python2.7/socket.py", line 187, in __init__
>>> >> > Oct  9 13:58:35 osd03 rbd-target-gw: _sock = _realsocket(family, type, proto)
>>> >> > Oct  9 13:58:35 osd03 rbd-target-gw: socket.error: [Errno 97] Address family not supported by protocol
>>> >> > Oct  9 13:58:35 osd03 systemd: rbd-target-gw.service: main process exited, code=exited, status=1/FAILURE
>>> >> >
>>> >> >
>>> >> > On Tue, 9 Oct 2018 at 13:16, Steven Vacaroaia <stef97@xxxxxxxxx> wrote:
>>> >> >>
>>> >> >> Hi ,
>>> >> >> I am using Mimic 13.2 and kernel 4.18
>>> >> >> Was using gwcli 2.5 and decided to upgrade to latest (2.7) as people reported improved performance
>>> >> >>
>>> >> >> What is the proper methodology ?
>>> >> >> How should I troubleshoot this?
>>> >> >>
>>> >> >>
>>> >> >>
>>> >> >> What I did (and it broke things) was:
>>> >> >>
>>> >> >> cd tcmu-runner; git pull ; make && make install
>>> >> >> cd ceph-iscsi-cli; git pull;python setup.py install
>>> >> >> cd ceph-iscsi-config;git pull; python setup.py install
>>> >> >> cd rtslib-fb;git pull;  python setup.py install
>>> >> >>
>>> >> >> After a reboot, I cannot start rbd-target-gw and the logs are not very helpful
>>> >> >>  ( Note:
>>> >> >>     I removed /etc/ceph/iscsi-gateway.cfg and gateway.conf object as I wanted to start fresh
>>> >> >>      /etc/ceph/iscsi-gatway.conf was left unchanged )
>>> >> >>
>>> >> >>
>>> >> >> 2018-10-09 12:47:50,593 [    INFO] - Processing osd blacklist entries for this node
>>> >> >> 2018-10-09 12:47:50,893 [    INFO] - No OSD blacklist entries found
>>> >> >> 2018-10-09 12:47:50,893 [    INFO] - Reading the configuration object to update local LIO configuration
>>> >> >> 2018-10-09 12:47:50,893 [    INFO] - Configuration does not have an entry for this host(osd03) - nothing to define to LIO
>>> >> >> 2018-10-09 12:47:50,893 [    INFO] - Integrated Prometheus exporter is enabled
>>> >> >> 2018-10-09 12:47:50,895 [    INFO] -  * Running on http://[::]:9287/
>>> >> >> 2018-10-09 12:47:50,896 [    INFO] - Removing iSCSI target from LIO
>>> >> >> 2018-10-09 12:47:50,896 [    INFO] - Removing LUNs from LIO
>>> >> >> 2018-10-09 12:47:50,896 [    INFO] - Active Ceph iSCSI gateway configuration removed
>>> >> >>
>>> >> >> Many thanks
>>> >> >> Steven
>>> >> >>
>>> >> > _______________________________________________
>>> >> > ceph-users mailing list
>>> >> > ceph-users@xxxxxxxxxxxxxx
>>> >> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>> >>
>>> >>
>>> >>
>>> >> --
>>> >> Jason
>>>
>>>
>>>
>>> --
>>> Jason



--
Jason
