This is very helpful, I'll take a look at it.

Thanks,
Pritha

On Thu, Jul 11, 2024 at 8:04 PM Ryan Rempel <rgrempel@xxxxxx> wrote:

> Thanks!
>
> I took a crack at it myself, and have some work-in-progress here:
>
> https://github.com/cmu-rgrempel/ceph/pull/1
>
> Feel free to use any of that if you like it. It's working for me, but I've
> only tested it with Azure AD – I haven't tested the cases that it used to
> work for. (I believe it doesn't break them, but haven't tested.)
>
> --
>
> Ryan Rempel
>
>
> ------------------------------
> *From:* Pritha Srivastava <prsrivas@xxxxxxxxxx>
> *Sent:* Monday, July 8, 2024 10:38 PM
>
> Hi Ryan,
>
> This appears to be a known issue and is tracked here:
> https://tracker.ceph.com/issues/54562. There is a workaround mentioned in
> the tracker that has worked and you can try that. Otherwise, I will be
> working on this 'invalid padding' problem very soon.
>
> Thanks,
> Pritha
>
> On Tue, Jul 9, 2024 at 1:16 AM Ryan Rempel <rgrempel@xxxxxx> wrote:
>
> I'm trying to set up the OIDC provider for RGW so that I can have roles
> that can be assumed by people logging into their regular Azure AD
> identities. The client I'm planning to use is Cyberduck – it seems like one
> of the few GUI S3 clients that manages the OIDC login process in a way that
> could work for relatively naive users.
>
> I've gotten a fair ways down the road. I've been able to configure
> Cyberduck so that it performs the login with Azure AD, gets an identity
> token, and then sends it to Ceph to engage with the
> AssumeRoleWithWebIdentity process.
> However, I then get an error, which shows up in the Ceph rgw logs like
> this:
>
> 2024-07-08T17:18:09.749+0000 7fb2d7845700 0 req 15967124976712370684
> 1.284013867s sts:assume_role_web_identity Signature validation failed: evp
> verify final failed: 0 error:0407008A:rsa
> routines:RSA_padding_check_PKCS1_type_1:invalid padding
>
> I turned the logging for rgw up to 20 to see if I could follow along to
> see how much of the process succeeds and learn more about what fails. I can
> then see logging messages from this file in the source code:
>
> https://github.com/ceph/ceph/blob/08d7ff952d78d1bbda04d5ff7e3db1e733301072/src/rgw/rgw_rest_sts.cc
>
> We get to WebTokenEngine::get_from_jwt, and it logs the JWT payload in a
> way that seems to be as expected. The logs then indicate that a request is
> sent to the /.well-known/openid-configuration endpoint that appears to be
> appropriate for the issuer of the JWT. The logs eventually indicate what
> looks like a successful and appropriate response to that. The logs then
> show that a request is sent to the jwks_uri that is indicated in the
> openid-configuration document. The response to that is logged, and it
> appears to be appropriate.
>
> We then get some logging starting with "Certificate is", so it looks like
> we're getting as far as WebTokenEngine::validate_signature. So, several
> things appear to have happened successfully – we've loaded the OIDC
> provider that corresponds to the iss, and we've found a client ID that
> corresponds to what I registered when I configured things. (This is why I
> say we appear to be a fair ways down the road – a lot of this is working.)
>
> It looks as though what's happening in the code now is that it's iterating
> through the certificates given in the jwks_uri content. There are 6
> certificates listed, but the code only gets as far as the first one.
> Looking at the code, what appears to be happening is that, among the
> various certificates in the jwks_uri, it's finding the first one which
> matches a thumbprint registered with Ceph (that is, which I registered with
> Ceph). This must be succeeding (for the first certificate), because the
> "Signature validation failed" logging comes later. So, the code does verify
> that the thumbprint of the first certificate matches one of the thumbprints
> I registered with Ceph for this OIDC provider.
>
> We then get to a part of the code where it tries to verify the JWT using
> the certificate, with jwt::verify. Given what gets logged ("Signature
> validation failed: "), this must be throwing an exception.
>
> The thing I find surprising about this is that there really isn't any
> reason to think that the first certificate listed in the jwks_uri content
> is going to be the certificate used to sign the JWT. If I understand JWT
> correctly, it's appropriate to sign the JWT with any of the certificates
> listed in the jwks_uri content. Furthermore, the JWT header includes a
> reference to the kid, so it's possible for Ceph to know exactly which
> certificate the JWT purports to be signed by. And, Ceph knows that there
> might be multiple thumbprints, because we can register 5. So, the logic of
> trying the first valid certificate in x5c and then stopping if it fails
> seems broken, actually.
>
> I suppose what I could do as a workaround is try to figure out whether
> Azure AD is consistently using the same kid to sign the JWTs for me, and
> then only register that thumbprint with Ceph. Then, Ceph would actually
> choose the correct certificate (as the others wouldn't match a thumbprint I
> registered). I may try this – in part, just to verify what I think is
> happening. But it would be awfully fragile – I don't believe there is any
> requirement in JWT to just use one of the certificates listed in x5c.
>
> An alternative would be to try rewriting the code to apply a different
> kind of logic. The way it ought to work (it seems to me) is something like
> this:
>
> * Get the openid_configuration, and get the jwks stuff from the jwks_uri
>   (which Ceph does already).
> * Look at the header of the JWT to see which kid it purports to be signed
>   by.
> * Find the certificate that corresponds to that kid (from the jwks_uri
>   content).
> * Validate the JWT with that certificate.
>
> That ought to work, at least given what I'm seeing. (But, I'm not a JWT
> expert, so I don't know whether there is something unusual in how Azure AD
> generates JWTs and handles the jwks_uri content.)
>
> Anyway, I'm curious whether anyone else has been trying to get this to
> work with Azure AD, and whether they have run into similar problems. And,
> of course, whether I appear to be misunderstanding anything about how this
> is supposed to work.
>
>
> Ryan Rempel
>
> Director of Information Technology
>
> Canadian Mennonite University
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
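
For readers following the thread, the kid-based lookup proposed in the original message can be sketched roughly as below. This is only an illustration in Python using the standard library; it is not Ceph's code and not the code in the pull request linked above. The function names, the sample token, and the sample JWKS content are all made up for the example, and the sketch stops at choosing the key: it does not verify the signature or check thumbprints.

```python
import base64
import json


def b64url_decode(segment: str) -> bytes:
    """Decode a base64url string that may lack '=' padding, as JWT segments do."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def b64url_encode(data: bytes) -> str:
    """Encode bytes as unpadded base64url (only used to build the sample token)."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def jwt_kid(token: str):
    """Return the 'kid' named in the (still unverified) JWT header, or None."""
    header_segment = token.split(".")[0]
    header = json.loads(b64url_decode(header_segment))
    return header.get("kid")


def select_jwk(token: str, jwks: dict):
    """Return the JWKS entry whose 'kid' matches the JWT header, or None."""
    kid = jwt_kid(token)
    if kid is None:
        return None
    for key in jwks.get("keys", []):
        if key.get("kid") == kid:
            return key
    return None


# Hypothetical sample data: a token whose header names "key-2", and a JWKS
# document with two entries. The x5c values are placeholders, not real
# certificates, and the signature segment is a dummy string.
header = b64url_encode(json.dumps({"alg": "RS256", "kid": "key-2"}).encode())
payload = b64url_encode(json.dumps({"iss": "https://issuer.example"}).encode())
token = f"{header}.{payload}.c2lnbmF0dXJl"

jwks = {
    "keys": [
        {"kid": "key-1", "x5c": ["CERT-ONE"]},
        {"kid": "key-2", "x5c": ["CERT-TWO"]},
    ]
}

selected = select_jwk(token, jwks)
print(selected["x5c"][0])  # prints CERT-TWO
```

The point of the sketch is only that the entry is chosen by the kid in the token header rather than by its position in the jwks_uri content. A real implementation would still need to validate the signature with the selected certificate and apply whatever thumbprint checks the provider configuration requires.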