On Thu, Oct 23, 2014 at 3:51 PM, Craig Lewis <clewis@xxxxxxxxxxxxxxxxxx> wrote:
> I'm having a problem getting RadosGW replication to work after upgrading to
> Apache 2.4 on my primary test cluster. Upgrading the secondary cluster to
> Apache 2.4 doesn't cause any problems. Both Ceph's Apache packages and
> Ubuntu's packages cause the same problem.
>
> I'm pretty sure I'm missing something obvious, but I'm not seeing it.
>
> Has anybody else upgraded their federated gateways to Apache 2.4?
>
>
> My setup:
> 2 VMs, each running its own Ceph cluster with replication=1.
> test0-ceph.cdlocal is the primary zone, named us-west.
> test1-ceph.cdlocal is the secondary zone, named us-central.
> Before I start, replication works, and I'm running:
>
> Ubuntu 14.04 LTS
> Emperor (0.72.2-1precise, held back with apt-mark hold)
> Apache 2.2 (2.2.22-2precise.ceph, held back with apt-mark hold)
>
>
> As soon as I upgrade Apache to 2.4 in the primary cluster, replication gets
> permission errors. radosgw-agent.log:
>
> 2014-10-23T15:13:43.022 31106:ERROR:radosgw_agent.worker:failed to sync object bucket3/test6.jpg: state is error
>
> The access logs from the primary say (using the vhost_combined log format):
>
> test0-ceph.cdlocal:80 172.16.205.1 - - [23/Oct/2014:15:16:51 -0700] "PUT /test6.jpg HTTP/1.1" 200 209 "-" "-"
> - - - [23/Oct/2014:13:24:18 -0700] "GET /?delimiter=/ HTTP/1.1" 200 1254 "-" "-" "bucket3.test0-ceph.cdlocal"
> <snip>
> test0-ceph.cdlocal:80 172.16.205.144 - - [23/Oct/2014:15:17:34 -0700] "GET /admin/log?marker=00000000089.89.3&type=bucket-index&bucket-instance=bucket3%3Aus-west.5697.2&max-entries=1000 HTTP/1.1" 200 398 "-" "Boto/2.20.1 Python/2.7.6 Linux/3.13.0-37-generic"
> test0-ceph.cdlocal:80 172.16.205.144 - - [23/Oct/2014:15:17:34 -0700] "GET /bucket3/test6.jpg?rgwx-uid=us-central&rgwx-region=us&rgwx-prepend-metadata=us HTTP/1.1" 403 249 "-" "-"
>
> 172.16.205.143 is the primary cluster, .144 is the secondary cluster, and
> .1 is my workstation.
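For anyone sifting through similar access logs, here is a small sketch (a hypothetical helper, not part of any Ceph tooling; it assumes the stock vhost_combined layout shown above) that pulls out the failing requests:

```python
import re

# Matches the request line and status code of an Apache access-log entry.
LINE_RE = re.compile(r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3})')

def failing_requests(lines):
    """Return (method, path, status) for every non-2xx log entry."""
    out = []
    for line in lines:
        m = LINE_RE.search(line)
        if m and not m.group("status").startswith("2"):
            out.append((m.group("method"), m.group("path"), int(m.group("status"))))
    return out

log = [
    'test0-ceph.cdlocal:80 172.16.205.1 - - [23/Oct/2014:15:16:51 -0700] '
    '"PUT /test6.jpg HTTP/1.1" 200 209 "-" "-"',
    'test0-ceph.cdlocal:80 172.16.205.144 - - [23/Oct/2014:15:17:34 -0700] '
    '"GET /bucket3/test6.jpg?rgwx-uid=us-central&rgwx-region=us&rgwx-prepend-metadata=us '
    'HTTP/1.1" 403 249 "-" "-"',
]
print(failing_requests(log))  # only the 403 object fetch is reported
```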
>
> The access logs on the secondary show:
>
> test1-ceph.cdlocal:80 172.16.205.144 - - [23/Oct/2014:15:18:07 -0700] "GET /admin/replica_log?bounds&type=bucket-index&bucket-instance=bucket3%3Aus-west.5697.2 HTTP/1.1" 200 643 "-" "Boto/2.20.1 Python/2.7.6 Linux/3.13.0-37-generic"
> test1-ceph.cdlocal:80 172.16.205.144 - - [23/Oct/2014:15:18:07 -0700] "PUT /bucket3/test6.jpg?rgwx-op-id=test1-ceph0.cdlocal%3A6484%3A3&rgwx-source-zone=us-west&rgwx-client-id=radosgw-agent HTTP/1.1" 403 286 "-" "Boto/2.20.1 Python/2.7.6 Linux/3.13.0-37-generic"
> test1-ceph.cdlocal:80 172.16.205.144 - - [23/Oct/2014:15:18:07 -0700] "GET /admin/opstate?client-id=radosgw-agent&object=bucket3%2Ftest6.jpg&op-id=test1-ceph0.cdlocal%3A6484%3A3 HTTP/1.1" 200 355 "-" "Boto/2.20.1 Python/2.7.6 Linux/3.13.0-37-generic"
>
> If I crank up radosgw debugging, it tells me that the calculated digest
> matches for the /admin/* requests, but not for the object GET:
>
> /admin/log
> 2014-10-23 15:44:29.257688 7fa6fcfb9700 15 calculated digest=6Tt13P6naWJEc0mJmYyDj6NzBS8=
> 2014-10-23 15:44:29.257690 7fa6fcfb9700 15 auth_sign=6Tt13P6naWJEc0mJmYyDj6NzBS8=
> 2014-10-23 15:44:29.257691 7fa6fcfb9700 15 compare=0
> 2014-10-23 15:44:29.257693 7fa6fcfb9700 20 system request
> <snip>
> /bucket3/test6.jpg
> 2014-10-23 15:44:29.411572 7fa6fc7b8700 15 calculated digest=pYWIOwRxCh4/bZ/D7b9RnS7RT1U=
> 2014-10-23 15:44:29.411573 7fa6fc7b8700 15 auth_sign=Gv398QNc6gLig9/0QbdO+1UZUq0=
> 2014-10-23 15:44:29.411574 7fa6fc7b8700 15 compare=-41
> 2014-10-23 15:44:29.411577 7fa6fc7b8700 10 failed to authorize request
>
> That explains the 403 responses.
>
> So metadata replication works, but data replication is failing with
> permission problems. I verified that I can create users and buckets in the
> primary, and have them replicate to the secondary.
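The compare=-41 above is just the result of the computed and supplied signatures diverging. A minimal sketch (generic AWS-signature-v2 style, not radosgw's actual code; the secret and header name here are illustrative) of why a header stripped in transit changes the calculated digest:

```python
import base64, hashlib, hmac

SECRET = b"not-a-real-secret"  # stand-in for the system user's secret key

def sign(method, resource, date, amz_headers):
    """Base64 HMAC-SHA1 over an AWS-v2-style string-to-sign
    (content-md5/content-type omitted for brevity)."""
    canon = "".join("%s:%s\n" % (k.lower(), v) for k, v in sorted(amz_headers.items()))
    string_to_sign = "%s\n%s\n%s%s" % (method, date, canon, resource)
    mac = hmac.new(SECRET, string_to_sign.encode(), hashlib.sha1)
    return base64.b64encode(mac.digest()).decode()

date = "Thu, 23 Oct 2014 22:44:29 GMT"
headers = {"x-amz-copy-source": "bucket3/test6.jpg"}  # hypothetical signed header

client_sig = sign("GET", "/bucket3/test6.jpg", date, headers)  # sender signs with the header
server_sig = sign("GET", "/bucket3/test6.jpg", date, {})       # server never saw the header

print(client_sig == server_sig)  # False -> digests differ -> 403
```

The client and server build the same string-to-sign only if every signed header survives the trip; drop one and the two digests can no longer match.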
>
> A similar situation was posted to the list before. That time, the problem
> was that the system users weren't correctly deployed to both the primary
> and secondary clusters. I verified that both users exist in both clusters,
> with the same access and secret keys.
>
> Just to test, I used s3cmd. I can read and write to both clusters using
> both system users' credentials.
>
>
> Anybody have any ideas?

You're hitting issue #9206. Apache 2.4 filters out certain HTTP headers
because they use underscores instead of dashes. There's a fix for that in
firefly, although it hasn't made it into an officially released version yet.

Yehuda
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
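For context on why underscore headers get treated as suspect: under the CGI meta-variable mapping (RFC 3875), dashes in header names become underscores, so a dashed and an underscored name collapse to the same variable, and servers that filter underscores avoid the ambiguity. A tiny illustration (the header names are made up, not the exact ones radosgw sent):

```python
def cgi_env_name(header):
    """RFC 3875 meta-variable mapping: uppercase, '-' -> '_', HTTP_ prefix."""
    return "HTTP_" + header.upper().replace("-", "_")

print(cgi_env_name("X-Object-Meta"))  # HTTP_X_OBJECT_META
print(cgi_env_name("X_Object_Meta"))  # HTTP_X_OBJECT_META -- same variable
```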