Hi,
after renaming our Ceph Luminous cluster (which we use with RGW), we are
unable to retrieve objects stored in RGW via the S3 API (a plain HTTP GET
works). We migrated the cluster from one datacenter to another, and in
doing so we had to completely rename the nodes' hostnames and change
their IP addresses. Everything otherwise seems to work OK: the OSDs are
up and communicating with each other. The debug logs from RGW show:
```
2019-11-19 12:12:13.741766 7fe11788f700 10 -- 10.13.92.24:0/388304201 >>
10.13.92.22:6816/8651 conn(0x55d36797c000 :-1
s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=60 cs=1 l=1).process
aborted = 0
2019-11-19 12:12:13.741794 7fe11788f700 5 -- 10.13.92.24:0/388304201 >>
10.13.92.22:6816/8651 conn(0x55d36797c000 :-1
s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=60 cs=1 l=1). rx osd.6
seq 1337 0x55d368f66000 osd_op_reply(19204 storage [call,getxattrs,stat]
v0'0 uv0 ondisk = -2 ((2) No such file or directory)) v8
2019-11-19 12:12:13.741810 7fe11788f700 1 -- 10.13.92.24:0/388304201
<== osd.6 10.13.92.22:6816/8651 1337 ==== osd_op_reply(19204 storage
[call,getxattrs,stat] v0'0 uv0 ondisk = -2 ((2) No such file or
directory)) v8 ==== 235+0+0 (249463175 0 0) 0x55d368f66000 con
0x55d36797c000
2019-11-19 12:12:13.741894 7fe11788f700 10 -- 10.13.92.24:0/388304201
dispatch_throttle_release 235 to dispatch throttler 0/0
2019-11-19 12:12:13.741926 7fe11788f700 10 -- 10.13.92.24:0/388304201 >>
10.13.92.22:6816/8651 conn(0x55d36797c000 :-1
s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=60 cs=1 l=1).process
aborted = 0
2019-11-19 12:12:13.741938 7fe11788f700 5 -- 10.13.92.24:0/388304201 >>
10.13.92.22:6816/8651 conn(0x55d36797c000 :-1
s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=60 cs=1 l=1). rx osd.6
seq 1338 0x55d368f66340 osd_op_reply(19205 storage [call,getxattrs,stat]
v0'0 uv0 ondisk = -2 ((2) No such file or directory)) v8
2019-11-19 12:12:13.741949 7fe11788f700 1 -- 10.13.92.24:0/388304201
<== osd.6 10.13.92.22:6816/8651 1338 ==== osd_op_reply(19205 storage
[call,getxattrs,stat] v0'0 uv0 ondisk = -2 ((2) No such file or
directory)) v8 ==== 235+0+0 (249463175 0 0) 0x55d368f66340 con
0x55d36797c000
2019-11-19 12:12:13.742002 7fe11788f700 10 -- 10.13.92.24:0/388304201
dispatch_throttle_release 235 to dispatch throttler 0/0
2019-11-19 12:12:13.742103 7fe00ae76700 1 civetweb: 0x55d3683f0000:
10.13.14.39 - - [19/Nov/2019:12:12:13 +0100] "GET /prod/zAVa1Jj.5
HTTP/1.0" 404 0 https://XXX Mozilla/5.0 (X11; Linux x86_64)
AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.97 Safari/537.36
```
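The "-2 ((2) No such file or directory)" replies above are for an op on an
object called "storage", which osd.6 below attributes to the
default.rgw.meta pool. To rule out the object simply being missing, it can
be checked directly at the rados level; a quick sketch (pool and namespace
names are taken from the osd.6 log below, adjust if your layout differs):
```
# List objects in the "root" namespace of the RGW metadata pool
# (or "rados -p default.rgw.meta ls --all" to list every namespace):
rados -p default.rgw.meta -N root ls

# Stat the exact object the OSD reports ENOENT for:
rados -p default.rgw.meta -N root stat storage
```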
With debug turned up to 20/20 on osd.6, I get:
```
2019-11-19 12:13:42.540310 7fa21ddde700 20 osd.6 op_wq(7) _process 4.1f
item PGQueueable(0x564ef3301500 prio 63 cost 11 e5942) queued
2019-11-19 12:13:42.540311 7fa211dc6700 20 osd.6 pg_epoch: 5942 pg[4.1f(
v 969'24 (0'0,969'24] local-lis/les=5930/5931 n=1 ec=245/245 lis/c
5930/5930 les/c/f 5931/5931/0 5930/5930/5870) [6,13,3] r=0 lpr=5930
crt=969'24 lcod 0'0 mlcod 0'0 active+clean] op_has_sufficient_caps
session=0x564ee2e3d980 pool=4 (default.rgw.meta root) owner=0
need_read_cap=1 need_write_cap=0 classes=[class version rd 1 wr 0 wl 1]
-> yes
2019-11-19 12:13:42.540335 7fa2455b4700 20 osd.6 op_wq(7) _enqueue 4.1f
PGQueueable(0x564ef3302840 prio 63 cost 11 e5942)
2019-11-19 12:13:42.540334 7fa211dc6700 10 osd.6 pg_epoch: 5942 pg[4.1f(
v 969'24 (0'0,969'24] local-lis/les=5930/5931 n=1 ec=245/245 lis/c
5930/5930 les/c/f 5931/5931/0 5930/5930/5870) [6,13,3] r=0 lpr=5930
crt=969'24 lcod 0'0 mlcod 0'0 active+clean] do_op
osd_op(client.184761179.0:19383 4.1f 4:fc055f8b:root::storage:head [call
version.read,getxattrs,stat] snapc 0=[] ondisk+read+known_if_redirected
e5942) v8 may_read -> read-ordered flags ondisk+read+known_if_redirected
2019-11-19 12:13:42.540363 7fa211dc6700 10 osd.6 pg_epoch: 5942 pg[4.1f(
v 969'24 (0'0,969'24] local-lis/les=5930/5931 n=1 ec=245/245 lis/c
5930/5930 les/c/f 5931/5931/0 5930/5930/5870) [6,13,3] r=0 lpr=5930
crt=969'24 lcod 0'0 mlcod 0'0 active+clean] get_object_context: obc NOT
found in cache: 4:fc055f8b:root::storage:head
2019-11-19 12:13:42.540379 7fa2455b4700 15 osd.6 5942 enqueue_op
0x564ef33039c0 prio 63 cost 11 latency 0.000013 epoch 5942
osd_op(client.184761179.0:19386 4.1f 4.d1faa03f (undecoded)
ondisk+read+known_if_redirected e5942) v8
2019-11-19 12:13:42.540389 7fa2455b4700 20 osd.6 op_wq(7) _enqueue 4.1f
PGQueueable(0x564ef33039c0 prio 63 cost 11 e5942)
2019-11-19 12:13:42.540430 7fa2455b4700 15 osd.6 5942 enqueue_op
0x564ef3301f80 prio 63 cost 11 latency 0.000017 epoch 5942
osd_op(client.184761179.0:19387 4.1f 4.d1faa03f (undecoded)
ondisk+read+known_if_redirected e5942) v8
2019-11-19 12:13:42.540426 7fa211dc6700 10 osd.6 pg_epoch: 5942 pg[4.1f(
v 969'24 (0'0,969'24] local-lis/les=5930/5931 n=1 ec=245/245 lis/c
5930/5930 les/c/f 5931/5931/0 5930/5930/5870) [6,13,3] r=0 lpr=5930
crt=969'24 lcod 0'0 mlcod 0'0 active+clean] get_object_context: no obc
for soid 4:fc055f8b:root::storage:head and !can_create
2019-11-19 12:13:42.540448 7fa2455b4700 20 osd.6 op_wq(7) _enqueue 4.1f
PGQueueable(0x564ef3301f80 prio 63 cost 11 e5942)
2019-11-19 12:13:42.540449 7fa211dc6700 20 osd.6 pg_epoch: 5942 pg[4.1f(
v 969'24 (0'0,969'24] local-lis/les=5930/5931 n=1 ec=245/245 lis/c
5930/5930 les/c/f 5931/5931/0 5930/5930/5870) [6,13,3] r=0 lpr=5930
crt=969'24 lcod 0'0 mlcod 0'0 active+clean] do_op: find_object_context
got error -2
2019-11-19 12:13:42.540467 7fa2455b4700 15 osd.6 5942 enqueue_op
0x564ef1cddc00 prio 63 cost 11 latency 0.000010 epoch 5942
osd_op(client.184761179.0:19388 4.1f 4.d1faa03f (undecoded)
ondisk+read+known_if_redirected e5942) v8
2019-11-19 12:13:42.540474 7fa211dc6700 10 osd.6 5942 dequeue_op
0x564ef33008c0 finish
```
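So osd.6 confirms the failing read: the op on object
4:fc055f8b:root::storage:head in pool 4 (default.rgw.meta) comes back with
-2, i.e. the metadata object isn't where RGW expects it. Since the only
things that changed were hostnames and IPs, a sketch of the checks we plan
to run next, comparing the zone/period configuration and bucket metadata
against what is actually in the pools (the bucket name "storage" is only
inferred from the failing object, adjust as needed):
```
# Show the zone and period configuration RGW is currently using:
radosgw-admin zone get
radosgw-admin period get

# List bucket metadata entries and inspect the affected bucket:
radosgw-admin metadata list bucket
radosgw-admin bucket stats --bucket=storage
```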
The strange thing is that uploads via the S3 API work; only downloads fail
with a 404 (a minimal reproduction is sketched below). Can someone point
us to where we should look? Thanks!
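For completeness, the symptom reproduces with any S3 client; a minimal
sketch with s3cmd (endpoint and credentials omitted; the bucket name is
only guessed from the civetweb request path above, adjust as needed):
```
# Upload succeeds:
s3cmd put ./hello.txt s3://prod/hello.txt

# Download of the same key (or any pre-existing key) returns 404:
s3cmd get s3://prod/hello.txt ./hello-copy.txt
```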