Re: RGW 10.2.5->10.2.7 authentication fail?

Hello Łukasz,

Thanks for your testing and sorry for my mistake. It looks like two commits
need to be reverted to restore the previous behaviour:

The already mentioned one:
  https://github.com/ceph/ceph/commit/c9445faf7fac2ccb8a05b53152c0ca16d7f4c6d0
Its dependency:
  https://github.com/ceph/ceph/commit/b72fc1b820ede3cd186d887d9d30f7f91fe3764b

They were merged in the same pull request:
  https://github.com/ceph/ceph/pull/11760
and account for the difference between v10.2.5 and v10.2.6 in how
"in_hosted_domain" is handled:
  https://github.com/ceph/ceph/blame/v10.2.5/src/rgw/rgw_rest.cc#L1773
  https://github.com/ceph/ceph/blame/v10.2.6/src/rgw/rgw_rest.cc#L1781-L1782

I'm really not sure we want to revert them. It may still be that they merely
expose a misconfiguration issue while fixing the problems we had with the
handling of virtual hosted buckets.
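
To illustrate why the "in_hosted_domain" decision matters for authentication,
here is a minimal Python sketch. It is not RGW code. It assumes the standard
AWS signature v2 layout (method, Content-MD5, Content-Type, date, canonical
resource, joined by newlines, with no x-amz-* headers) and uses a made-up
secret key. The two canonical resources are the ones visible in the logs
quoted below: "/" on 10.2.5 and "/bbpsrvc15.cscs.ch/" on 10.2.7. The date is
held fixed so that only the resource differs.

```python
import base64
import hashlib
import hmac


def string_to_sign(method, date, resource, content_md5="", content_type=""):
    """Build an AWS signature v2 StringToSign without x-amz-* headers."""
    return "\n".join([method, content_md5, content_type, date, resource])


def sign_v2(secret_key, sts):
    """Return base64(HMAC-SHA1(secret_key, StringToSign))."""
    mac = hmac.new(secret_key.encode("utf-8"), sts.encode("utf-8"), hashlib.sha1)
    return base64.b64encode(mac.digest()).decode("ascii")


# Hypothetical secret key, for illustration only.
SECRET = "hypothetical-secret-key"
DATE = "Thu, 27 Apr 2017 07:49:59 GMT"

# Canonical resource as logged by 10.2.5 (hostname not treated as a bucket).
sts_10_2_5 = string_to_sign("GET", DATE, "/")
# Canonical resource as logged by 10.2.7 (hostname folded into the resource).
sts_10_2_7 = string_to_sign("GET", DATE, "/bbpsrvc15.cscs.ch/")

sig_10_2_5 = sign_v2(SECRET, sts_10_2_5)
sig_10_2_7 = sign_v2(SECRET, sts_10_2_7)

print("10.2.5 resource ->", sig_10_2_5)
print("10.2.7 resource ->", sig_10_2_7)
print("signatures match:", sig_10_2_5 == sig_10_2_7)
```

With the same key and date, the two resources yield different signatures, so
a gateway that derives a different canonical resource than the client did
would be expected to answer with SignatureDoesNotMatch.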

Regards,
Radek

On Wed, May 3, 2017 at 3:12 AM, Łukasz Jagiełło
<jagiello.lukasz@xxxxxxxxx> wrote:
> Hi,
>
> Today I tried reverting [1] from 10.2.7, but the problem is still there even
> without the change. Reverting to 10.2.5 fixes the issue instantly.
>
> [1] https://github.com/ceph/ceph/commit/c9445faf7fac2ccb8a05b53152c0ca16d7f4c6d0
>
> On Thu, Apr 27, 2017 at 4:53 AM, Radoslaw Zarzynski
> <rzarzynski@xxxxxxxxxxxx> wrote:
>>
>> Bingo! From the 10.2.5-admin:
>>
>>   GET
>>
>>   Thu, 27 Apr 2017 07:49:59 GMT
>>   /
>>
>> And also:
>>
>>   2017-04-27 09:49:59.117447 7f4a90ff9700 20 subdomain= domain= in_hosted_domain=0 in_hosted_domain_s3website=0
>>   2017-04-27 09:49:59.117449 7f4a90ff9700 20 final domain/bucket subdomain= domain= in_hosted_domain=0 in_hosted_domain_s3website=0 s->info.domain= s->info.request_uri=/
>>
>> The most interesting part is the "final ... in_hosted_domain=0".
>> It looks like we need to dig around RGWREST::preprocess(),
>> rgw_find_host_in_domains() & company.
>>
>> There is a commit introduced in v10.2.6 that touches this area [1].
>> I'm definitely not saying it's the root cause. It might be that a change
>> in the code has just exposed a configuration issue [2].
>>
>> I will raise the problem at today's sync-up.
>>
>> Thanks for the logs!
>> Regards,
>> Radek
>>
>> [1] https://github.com/ceph/ceph/commit/c9445faf7fac2ccb8a05b53152c0ca16d7f4c6d0
>> [2] http://tracker.ceph.com/issues/17440
>>
>> On Thu, Apr 27, 2017 at 10:11 AM, Ben Morrice <ben.morrice@xxxxxxx> wrote:
>> > Hello Radek,
>> >
>> > Thank you for your analysis so far! Please find attached logs for both
>> > the admin user and a Keystone-backed user from 10.2.5 (same host as
>> > before, I have simply downgraded the packages). Both users can
>> > authenticate and list buckets on 10.2.5.
>> >
>> > Also, I tried version 10.2.6 and see the same behavior as 10.2.7, so
>> > the bug I'm hitting looks like it was introduced in 10.2.6.
>> >
>> > Kind regards,
>> >
>> > Ben Morrice
>> >
>> > ______________________________________________________________________
>> > Ben Morrice | e: ben.morrice@xxxxxxx | t: +41-21-693-9670
>> > EPFL / BBP
>> > Biotech Campus
>> > Chemin des Mines 9
>> > 1202 Geneva
>> > Switzerland
>> >
>> > On 27/04/17 04:45, Radoslaw Zarzynski wrote:
>> >>
>> >> Thanks for the logs, Ben.
>> >>
>> >> It looks like two completely different authenticators have failed:
>> >> the local, RADOS-backed auth (admin.txt) and the Keystone-based
>> >> one as well. In the second case I'm pretty sure that Keystone has
>> >> refused [1][2] to authenticate the provided signature/StringToSign.
>> >> RGW tried to fall back to the local auth, which obviously didn't have
>> >> any chance as the credentials were stored remotely. This explains
>> >> the presence of "error reading user info" in user-keystone.txt.
>> >>
>> >> What both scenarios have in common are the low-level details of
>> >> StringToSign crafting and signature generation on RadosGW's side.
>> >> The following one has been composed for the request from admin.txt:
>> >>
>> >>    GET
>> >>
>> >>
>> >>    Wed, 26 Apr 2017 09:18:42 GMT
>> >>    /bbpsrvc15.cscs.ch/
>> >>
>> >> If you could provide a similar log from v10.2.5, I would be really
>> >> grateful.
>> >>
>> >> Regards,
>> >> Radek
>> >>
>> >> [1] https://github.com/ceph/ceph/blob/v10.2.7/src/rgw/rgw_rest_s3.cc#L3269-L3272
>> >> [2] https://github.com/ceph/ceph/blob/v10.2.7/src/rgw/rgw_common.h#L170
>> >>
>> >> On Wed, Apr 26, 2017 at 11:29 AM, Morrice Ben <ben.morrice@xxxxxxx>
>> >> wrote:
>> >>>
>> >>> Hello Radek,
>> >>>
>> >>> Please find attached the failed request for both the admin user and a
>> >>> standard user (backed by keystone).
>> >>>
>> >>> Kind regards,
>> >>>
>> >>> Ben Morrice
>> >>>
>> >>>
>> >>> ________________________________________
>> >>> From: Radoslaw Zarzynski <rzarzynski@xxxxxxxxxxxx>
>> >>> Sent: Tuesday, April 25, 2017 7:38 PM
>> >>> To: Morrice Ben
>> >>> Cc: ceph-users@xxxxxxxxxxxxxx
>> >>> Subject: Re:  RGW 10.2.5->10.2.7 authentication fail?
>> >>>
>> >>> Hello Ben,
>> >>>
>> >>> Could you provide the full RadosGW log for the failed request?
>> >>> I mean the lines starting from the header listing, through the start
>> >>> marker ("====== starting new request...") up to the end marker.
>> >>>
>> >>> At the moment we can't see any details related to the signature
>> >>> calculation.
>> >>>
>> >>> Regards,
>> >>> Radek
>> >>>
>> >>> On Thu, Apr 20, 2017 at 5:08 PM, Ben Morrice <ben.morrice@xxxxxxx>
>> >>> wrote:
>> >>>>
>> >>>> Hi all,
>> >>>>
>> >>>> I have tried upgrading one of our RGW servers from 10.2.5 to 10.2.7
>> >>>> (RHEL7), and authentication is in a very bad state. This installation
>> >>>> is part of a multigw configuration, and I have just updated one host
>> >>>> in the secondary zone (all other hosts/zones are running 10.2.5).
>> >>>>
>> >>>> On the 10.2.7 server I cannot authenticate as a user (normally backed
>> >>>> by OpenStack Keystone), but even worse, I also cannot authenticate
>> >>>> with an admin user.
>> >>>>
>> >>>> Please see [1] for the results of performing a list bucket operation
>> >>>> with python boto (the script works against rgw 10.2.5).
>> >>>>
>> >>>> Also, if I try to authenticate from the 'master' rgw zone with a
>> >>>> "radosgw-admin sync status --rgw-zone=bbp-gva-master" I get:
>> >>>>
>> >>>> "ERROR: failed to fetch datalog info"
>> >>>>
>> >>>> "failed to retrieve sync info: (13) Permission denied"
>> >>>>
>> >>>> The above errors correlate with the errors in the log on the server
>> >>>> running 10.2.7 (debug level 20) at [2].
>> >>>>
>> >>>> I'm not sure what I have done wrong or what I can try next.
>> >>>>
>> >>>> By the way, downgrading the packages from 10.2.7 to 10.2.5 restores
>> >>>> authentication functionality.
>> >>>>
>> >>>> [1]
>> >>>> boto.exception.S3ResponseError: S3ResponseError: 403 Forbidden
>> >>>> <?xml version="1.0" encoding="UTF-8"?><Error><Code>SignatureDoesNotMatch</Code><RequestId>tx000000000000000000004-0058f8c86a-3fa2959-bbp-gva-secondary</RequestId><HostId>3fa2959-bbp-gva-secondary-bbp-gva</HostId></Error>
>> >>>>
>> >>>> [2]
>> >>>> /bbpsrvc15.cscs.ch/admin/log
>> >>>> 2017-04-20 16:43:04.916253 7ff87c6c0700 15 calculated digest=Ofg/f/NI0L4eEG1MsGk4PsVscTM=
>> >>>> 2017-04-20 16:43:04.916255 7ff87c6c0700 15 auth_sign=qZ3qsy7AuNCOoPMhr8yNoy5qMKU=
>> >>>> 2017-04-20 16:43:04.916255 7ff87c6c0700 15 compare=34
>> >>>> 2017-04-20 16:43:04.916266 7ff87c6c0700 10 failed to authorize request
>> >>>> 2017-04-20 16:43:04.916268 7ff87c6c0700 20 handler->ERRORHANDLER: err_no=-2027 new_err_no=-2027
>> >>>> 2017-04-20 16:43:04.916329 7ff87c6c0700  2 req 354:0.052585:s3:GET /admin/log:get_obj:op status=0
>> >>>> 2017-04-20 16:43:04.916339 7ff87c6c0700  2 req 354:0.052595:s3:GET /admin/log:get_obj:http status=403
>> >>>> 2017-04-20 16:43:04.916343 7ff87c6c0700  1 ====== req done req=0x7ff87c6ba710 op status=0 http_status=403 ======
>> >>>> 2017-04-20 16:43:04.916350 7ff87c6c0700 20 process_request() returned -2027
>> >>>> 2017-04-20 16:43:04.916390 7ff87c6c0700  1 civetweb: 0x7ff990015610: 10.80.6.26 - - [20/Apr/2017:16:43:04 +0200] "GET /admin/log HTTP/1.1" 403 0 - -
>> >>>> 2017-04-20 16:43:04.917212 7ff9777e6700 20 cr:s=0x7ff97000d420:op=0x7ff9703a5440:18RGWMetaSyncShardCR: operate()
>> >>>> 2017-04-20 16:43:04.917223 7ff9777e6700 20 rgw meta sync: incremental_sync:1544: shard_id=20 mdlog_marker=1_1492686039.901886_5551978.1 sync_marker.marker=1_1492686039.901886_5551978.1 period_marker=
>> >>>> 2017-04-20 16:43:04.917227 7ff9777e6700 20 rgw meta sync: incremental_sync:1551: shard_id=20 syncing mdlog for shard_id=20
>> >>>> 2017-04-20 16:43:04.917236 7ff9777e6700 20 cr:s=0x7ff97000d420:op=0x7ff970066b80:24RGWCloneMetaLogCoroutine: operate()
>> >>>> 2017-04-20 16:43:04.917238 7ff9777e6700 20 rgw meta sync: operate: shard_id=20: init request
>> >>>> 2017-04-20 16:43:04.917240 7ff9777e6700 20 cr:s=0x7ff97000d420:op=0x7ff970066b80:24RGWCloneMetaLogCoroutine: operate()
>> >>>> 2017-04-20 16:43:04.917241 7ff9777e6700 20 rgw meta sync: operate: shard_id=20: reading shard status
>> >>>> 2017-04-20 16:43:04.917303 7ff9777e6700 20 run: stack=0x7ff97000d420 is io blocked
>> >>>> 2017-04-20 16:43:04.918285 7ff9777e6700 20 cr:s=0x7ff97000d420:op=0x7ff970066b80:24RGWCloneMetaLogCoroutine: operate()
>> >>>> 2017-04-20 16:43:04.918295 7ff9777e6700 20 rgw meta sync: operate: shard_id=20: reading shard status complete
>> >>>> 2017-04-20 16:43:04.918307 7ff9777e6700 20 rgw meta sync: shard_id=20 marker=1_1492686039.901886_5551978.1 last_update=2017-04-20 13:00:39.0.901886s
>> >>>> 2017-04-20 16:43:04.918316 7ff9777e6700 20 cr:s=0x7ff97000d420:op=0x7ff970066b80:24RGWCloneMetaLogCoroutine: operate()
>> >>>> 2017-04-20 16:43:04.918317 7ff9777e6700 20 rgw meta sync: operate: shard_id=20: sending rest request
>> >>>> 2017-04-20 16:43:04.918381 7ff9777e6700 20 RGWEnv::set(): HTTP_DATE: Thu Apr 20 14:43:04 2017
>> >>>> 2017-04-20 16:43:04.918390 7ff9777e6700 20 > HTTP_DATE -> Thu Apr 20 14:43:04 2017
>> >>>> 2017-04-20 16:43:04.918404 7ff9777e6700 10 get_canon_resource(): dest=/admin/log
>> >>>> 2017-04-20 16:43:04.918406 7ff9777e6700 10 generated canonical header: GET
>> >>>>
>> >>>> --
>> >>>> Kind regards,
>> >>>>
>> >>>> Ben Morrice
>> >>>>
>> >>>>
>> >>>>
>> >>>> _______________________________________________
>> >>>> ceph-users mailing list
>> >>>> ceph-users@xxxxxxxxxxxxxx
>> >>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> >
>> >
>
>
>
>
> --
> Łukasz Jagiełło
> lukasz<at>jagiello<dot>org



