On 16-02-27 06:09, Yehuda Sadeh-Weinraub wrote:
On Wed, Feb 24, 2016 at 5:48 PM, Ben Hines <bhines@xxxxxxxxx> wrote:
Any idea what is going on here? I get these intermittently, especially with
very large files.
The client is doing Range requests on this >51 GB file, incrementally
fetching later chunks.
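The access pattern is roughly like the sketch below (illustrative only, the
endpoint, object name and 4 MiB chunk size are made up, not the actual
client code):

    import requests

    # Placeholders, not the real endpoint/bucket/object names.
    url = "http://rgw.example.com:8080/somebucket/some-large-object.pkg"
    chunk = 4 * 1024 * 1024  # fetch 4 MiB per request

    total = int(requests.head(url).headers["Content-Length"])
    with open("some-large-object.pkg", "wb") as out:
        for ofs in range(0, total, chunk):
            end = min(ofs + chunk, total) - 1
            # One GET per chunk, using an HTTP Range header, so rgw serves
            # the object as many small ranged requests.
            resp = requests.get(url, headers={"Range": "bytes=%d-%d" % (ofs, end)})
            resp.raise_for_status()
            out.write(resp.content)

Here is the rgw log for one of the failing requests: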
2016-02-24 16:30:59.669561 7fd33b7fe700 1 ====== starting new request
req=0x7fd32c0879c0 =====
2016-02-24 16:30:59.669675 7fd33b7fe700 2 req 3648804:0.000114::GET
/<redacted>/int8-0.181.4-1654016.2016-02-23_03-53-42.pkg::initializing for
trans_id = tx00000000000000037ad24-0056ce4b43-259914b-default
2016-02-24 16:30:59.669687 7fd33b7fe700 10 host=<redacted>
2016-02-24 16:30:59.669757 7fd33b7fe700 10
s->object=<redacted>/int8-0.181.4-1654016.2016-02-23_03-53-42.pkg
s->bucket=<redacted>
2016-02-24 16:30:59.669767 7fd33b7fe700 2 req 3648804:0.000206:s3:GET
/<redacted>/int8-0.181.4-1654016.2016-02-23_03-53-42.pkg::getting op
2016-02-24 16:30:59.669776 7fd33b7fe700 2 req 3648804:0.000215:s3:GET
/<redacted>/int8-0.181.4-1654016.2016-02-23_03-53-42.pkg:get_obj:authorizing
2016-02-24 16:30:59.669785 7fd33b7fe700 2 req 3648804:0.000224:s3:GET
/<redacted>/int8-0.181.4-1654016.2016-02-23_03-53-42.pkg:get_obj:reading
permissions
2016-02-24 16:30:59.673797 7fd33b7fe700 10 manifest: total_size =
50346000384
2016-02-24 16:30:59.673841 7fd33b7fe700 2 req 3648804:0.004280:s3:GET
/<redacted>/int8-0.181.4-1654016.2016-02-23_03-53-42.pkg:get_obj:init op
2016-02-24 16:30:59.673867 7fd33b7fe700 10 cache get:
name=.users.uid+<redacted> : hit
2016-02-24 16:30:59.673881 7fd33b7fe700 10 cache get:
name=.users.uid+<redacted> : hit
2016-02-24 16:30:59.673921 7fd33b7fe700 2 req 3648804:0.004360:s3:GET
/<redacted>/int8-0.181.4-1654016.2016-02-23_03-53-42.pkg:get_obj:verifying
op mask
2016-02-24 16:30:59.673929 7fd33b7fe700 2 req 3648804:0.004369:s3:GET
/<redacted>/int8-0.181.4-1654016.2016-02-23_03-53-42.pkg:get_obj:verifying
op permissions
2016-02-24 16:30:59.673941 7fd33b7fe700 5 Searching permissions for
uid=anonymous mask=49
2016-02-24 16:30:59.673944 7fd33b7fe700 5 Permissions for user not found
2016-02-24 16:30:59.673946 7fd33b7fe700 5 Searching permissions for group=1
mask=49
2016-02-24 16:30:59.673949 7fd33b7fe700 5 Found permission: 1
2016-02-24 16:30:59.673951 7fd33b7fe700 5 Searching permissions for group=2
mask=49
2016-02-24 16:30:59.673953 7fd33b7fe700 5 Permissions for group not found
2016-02-24 16:30:59.673955 7fd33b7fe700 5 Getting permissions id=anonymous
owner=<redacted> perm=1
2016-02-24 16:30:59.673957 7fd33b7fe700 10 uid=anonymous requested perm
(type)=1, policy perm=1, user_perm_mask=15, acl perm=1
2016-02-24 16:30:59.673961 7fd33b7fe700 2 req 3648804:0.004400:s3:GET
/<redacted>/int8-0.181.4-1654016.2016-02-23_03-53-42.pkg:get_obj:verifying
op params
2016-02-24 16:30:59.673965 7fd33b7fe700 2 req 3648804:0.004404:s3:GET
/<redacted>/int8-0.181.4-1654016.2016-02-23_03-53-42.pkg:get_obj:executing
2016-02-24 16:30:59.674107 7fd33b7fe700 0 RGWObjManifest::operator++():
result: ofs=130023424 stripe_ofs=130023424 part_ofs=104857600
rule->part_size=52428800
2016-02-24 16:30:59.674193 7fd33b7fe700 0 RGWObjManifest::operator++():
result: ofs=134217728 stripe_ofs=134217728 part_ofs=104857600
rule->part_size=52428800
2016-02-24 16:30:59.674317 7fd33b7fe700 0 RGWObjManifest::operator++():
result: ofs=138412032 stripe_ofs=138412032 part_ofs=104857600
rule->part_size=52428800
2016-02-24 16:30:59.674433 7fd33b7fe700 0 RGWObjManifest::operator++():
result: ofs=142606336 stripe_ofs=142606336 part_ofs=104857600
rule->part_size=52428800
2016-02-24 16:31:00.046110 7fd33b7fe700 0 RGWObjManifest::operator++():
result: ofs=146800640 stripe_ofs=146800640 part_ofs=104857600
rule->part_size=52428800
2016-02-24 16:31:00.150966 7fd33b7fe700 0 RGWObjManifest::operator++():
result: ofs=150994944 stripe_ofs=150994944 part_ofs=104857600
rule->part_size=52428800
2016-02-24 16:31:00.151118 7fd33b7fe700 0 RGWObjManifest::operator++():
result: ofs=155189248 stripe_ofs=155189248 part_ofs=104857600
rule->part_size=52428800
2016-02-24 16:31:00.161000 7fd33b7fe700 0 RGWObjManifest::operator++():
result: ofs=157286400 stripe_ofs=157286400 part_ofs=157286400
rule->part_size=52428800
2016-02-24 16:31:00.199553 7fd33b7fe700 0 RGWObjManifest::operator++():
result: ofs=161480704 stripe_ofs=161480704 part_ofs=157286400
rule->part_size=52428800
2016-02-24 16:31:00.278308 7fd33b7fe700 0 RGWObjManifest::operator++():
result: ofs=165675008 stripe_ofs=165675008 part_ofs=157286400
rule->part_size=52428800
2016-02-24 16:31:00.312306 7fd33b7fe700 0 RGWObjManifest::operator++():
result: ofs=169869312 stripe_ofs=169869312 part_ofs=157286400
rule->part_size=52428800
2016-02-24 16:31:00.751626 7fd33b7fe700 0 RGWObjManifest::operator++():
result: ofs=174063616 stripe_ofs=174063616 part_ofs=157286400
rule->part_size=52428800
2016-02-24 16:31:00.833570 7fd33b7fe700 0 RGWObjManifest::operator++():
result: ofs=178257920 stripe_ofs=178257920 part_ofs=157286400
rule->part_size=52428800
2016-02-24 16:31:00.871774 7fd33b7fe700 0 ERROR: flush_read_list():
d->client_c->handle_data() returned -5
Maybe add 'debug ms = 1'?
Yehuda
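(For anyone following along: assuming the radosgw client section shown
further down, that would mean adding "debug ms = 1" under
[client.sm-cephrgw5] in ceph.conf and restarting radosgw, or setting it on
the running daemon via its admin socket with "config set debug_ms 1".)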
I am having a similar problem. Here is my output (with object/bucket names
replaced), with debug ms = 1:
2016-09-05 21:24:36.338651 7f2d7b766700 1 ====== starting new request
req=0x7f2d7b7608a0 =====
2016-09-05 21:24:36.338807 7f2d7b766700 1 -- 10.169.28.44:0/1477608244
--> 10.169.5.2:6820/6099 -- osd_op(client.20247176.0:1771 17.6d5655ad
default.19781482.1_SOMEOBJECT01 [getxattrs,stat] snapc 0=[]
ack+read+known_if_redirected e21024) v7 -- ?+0 0x7f3030006650 con
0x7f2e0801b2b0
2016-09-05 21:24:36.339543 7f2d310cc700 1 -- 10.169.28.44:0/1477608244
<== osd.40 10.169.5.2:6820/6099 35 ==== osd_op_reply(1771
default.19781482.1_SOMEOBJECT01 [getxattrs,stat] v0'0 uv101163 ondisk =
0) v7 ==== 247+0+53526 (1646996105 0 3189628511) 0x7f2ff801ede0 con
0x7f2e0801b2b0
2016-09-05 21:24:36.339958 7f2d7b766700 1 -- 10.169.28.44:0/1477608244
--> 10.169.14.45:6844/71802 -- osd_op(client.20247176.0:1772 17.b0f772b2
default.19781482.1__shadow_SOMEOBJECT01.2~ioN9kG4y9fem98jNURGcd-hiOz0rUcY.141_24
[read 0~4194304] snapc 0=[] ack+read+known_if_redirected e21024) v7 --
?+0 0x7f3030025250 con 0x7f2e7001e580
2016-09-05 21:24:36.339987 7f2d7b766700 1 -- 10.169.28.44:0/1477608244
--> 10.169.5.2:6836/7553 -- osd_op(client.20247176.0:1773 17.40daf8fc
default.19781482.1__shadow_SOMEOBJECT01.2~ioN9kG4y9fem98jNURGcd-hiOz0rUcY.141_25
[read 0~4194304] snapc 0=[] ack+read+known_if_redirected e21024) v7 --
?+0 0x7f3030026020 con 0x7f2ffc003030
2016-09-05 21:24:36.354647 7f2d2ca98700 1 -- 10.169.28.44:0/1477608244
<== osd.71 10.169.14.45:6844/71802 15 ==== osd_op_reply(1596
default.19781482.1__shadow_SOMEOBJECT02.2~mrZUl8RCA1QjnfDAMIFDgbddOQDBMGi.123_42
[read 0~4194304] v0'0 uv101395 ondisk = 0) v7 ==== 258+0+4194304
(2751466370 0 2582022669) 0x7f2e0400d2f0 con 0x7f2e7001e580
2016-09-05 21:24:36.355813 7f2d93f97700 0 ERROR: flush_read_list():
d->client_c->handle_data() returned -5
2016-09-05 21:24:36.356265 7f2d93f97700 0 WARNING: set_req_state_err
err_no=5 resorting to 500
2016-09-05 21:24:36.356290 7f2d93f97700 0 ERROR:
s->cio->send_content_length() returned err=-5
2016-09-05 21:24:36.356293 7f2d93f97700 0 ERROR: s->cio->print()
returned err=-5
2016-09-05 21:24:36.356294 7f2d93f97700 0 ERROR: STREAM_IO(s)->print()
returned err=-5
2016-09-05 21:24:36.356299 7f2d93f97700 0 ERROR:
STREAM_IO(s)->complete_header() returned err=-5
2016-09-05 21:24:36.356348 7f2d93f97700 1 -- 10.169.28.44:0/1477608244
--> 10.169.5.2:6836/7553 -- osd_op(client.20247176.0:1774 11.cddb167c
2016-09-05-21-default.19781482.1-uber-mysql [append 0~556] snapc 0=[]
ack+ondisk+write+known_if_redirected e21024) v7 -- ?+0 0x7f2f7400df40
con 0x7f2ffc003030
2016-09-05 21:24:36.356373 7f2d93f97700 1 ====== req done
req=0x7f2d93f918a0 op status=-5 http_status=500 ======
2016-09-05 21:24:36.356437 7f2d93f97700 1 civetweb: 0x7f2f74001c70:
10.169.26.22 - - [05/Sep/2016:21:24:33 +0000] "GET
/SOME_BUCKET/SOMEOBJECT02 HTTP/1.1" 500 0 - aws-cli/1.10.20 Python/2.7.3
Linux/3.18.27-031827-generic botocore/1.4.11
Please note that right before the failure a different RGW object is
mentioned: the request that just started is for SOMEOBJECT01, but the
osd_op_reply immediately preceding the error is for a SOMEOBJECT02 shadow
object. It looks wrong... I am on 10.2.2
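(One way to check which shadow objects actually belong to each of those
objects would be to dump the manifest with "radosgw-admin object stat
--bucket=<bucket> --object=<object>", using the real, here redacted,
bucket/object names, and compare it against the shadow object names in the
log above.)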
2016-02-24 16:31:00.872480 7fd33b7fe700 0 WARNING: set_req_state_err
err_no=5 resorting to 500
2016-02-24 16:31:00.872561 7fd33b7fe700 0 ERROR: s->cio->print() returned
err=-5
2016-02-24 16:31:00.872567 7fd33b7fe700 0 ERROR: s->cio->print() returned
err=-5
2016-02-24 16:31:00.872571 7fd33b7fe700 0 ERROR: s->cio->print() returned
err=-5
2016-02-24 16:31:00.872592 7fd33b7fe700 0 ERROR: s->cio->complete_header()
returned err=-5
2016-02-24 16:31:00.872616 7fd33b7fe700 2 req 3648804:1.203055:s3:GET
/<redacted>/int8-0.181.4-1654016.2016-02-23_03-53-42.pkg:get_obj:http
status=500
2016-02-24 16:31:00.872629 7fd33b7fe700 1 ====== req done
req=0x7fd32c0879c0 http_status=500 ======
Relevant ceph.conf settings:
[client.sm-cephrgw5]
rgw enable ops log = False
rgw frontends = civetweb port=8080 num_threads=75
error_log_file=/var/log/ceph/civetweb.error.log
access_log_file=/var/log/ceph/civetweb.access.log
rgw num rados handles = 10
rgw cache lru size = 30000
debug civetweb = 10
rgw override bucket index max shards = 23
debug rgw = 10
Ceph 9.2.0, cluster is HEALTH_OK
-Ben
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com