Hi, We
have two CEPH clusters running on Dumpling 0.67.11 and some of our
"multipart objects" are incompleted. It seems that some slow requests
could cause corruption of related S3 objects. Moveover GETs for that objects
are working without any error messages. There are only HTTP 200 in logs as well
as no information about problems from popular client tools/libs. The
situation looks very similiar to described in bug #8269, but we are using fixed
0.67.11 version: http://tracker.ceph.com/issues/8269 Regards, Sławomir
Ritter EXAMPLE#1 slow_request ============ 2016-02-23
13:49:58.818640 osd.260 10.176.67.27:6800/688083 2119 : [WRN] 4 slow requests,
4 included below; oldest blocked for > 30.727096 secs 2016-02-23
13:49:58.818673 osd.260 10.176.67.27:6800/688083 2120 : [WRN] slow request
30.727096 seconds old, received at 2016-02-23 13:49:28.091460: osd_op(c lient.47792965.0:185007087
default.14654.445__shadow_c9f8db1b-cee2-4ec8-8fb3-8b4bc7585d80.1456231572.877051.ismv.b73N3OmW4OhCjDYSR-RTkZNNIKA1C9Z.57_2
[writef ull
0~524288] 10.ce729ebe e107594) v4 currently waiting for subops from [469,9] HTTP_500
in apache.log ====================== 127.0.0.1
- - [23/Feb/2016:13:49:27 +0100] "PUT
/video-shbc/c9f8db1b-cee2-4ec8-8fb3-8b4bc7585d80.1456231572.877051.ismv?uploadId=b73N3OmW4OhCjDYSR-RTkZNNIKA1C9Z&partNumber=56
HTTP/1.0" 200 221 "-" "Boto/2.31.1 Python/2.7.3 Linux/3.13.0-39-generic(syncworker)" 127.0.0.1
- - [23/Feb/2016:13:49:28 +0100] "PUT
/video-shbc/c9f8db1b-cee2-4ec8-8fb3-8b4bc7585d80.1456231572.877051.ismv?uploadId=b73N3OmW4OhCjDYSR-RTkZNNIKA1C9Z&partNumber=57
HTTP/1.0" 500 751 "-" "Boto/2.31.1 Python/2.7.3 Linux/3.13.0-39-generic(syncworker)" 127.0.0.1
- - [23/Feb/2016:13:49:58 +0100] "PUT
/video-shbc/c9f8db1b-cee2-4ec8-8fb3-8b4bc7585d80.1456231572.877051.ismv?uploadId=b73N3OmW4OhCjDYSR-RTkZNNIKA1C9Z&partNumber=57
HTTP/1.0" 200 221 "-" "Boto/2.31.1 Python/2.7.3 Linux/3.13.0-39-generic(syncworker)" 127.0.0.1
- - [23/Feb/2016:13:49:59 +0100] "PUT
/video-shbc/c9f8db1b-cee2-4ec8-8fb3-8b4bc7585d80.1456231572.877051.ismv?uploadId=b73N3OmW4OhCjDYSR-RTkZNNIKA1C9Z&partNumber=58
HTTP/1.0" 200 221 "-" "Boto/2.31.1 Python/2.7.3 Linux/3.13.0-39-generic(syncworker)" Empty
RADOS object (real size = 0 bytes), list generated basis on MANIFEST ========================================================================== found
default.14654.445__shadow_c9f8db1b-cee2-4ec8-8fb3-8b4bc7585d80.1456231572.877051.ismv.b73N3OmW4OhCjDYSR-RTkZNNIKA1C9Z.56_2
2097152 ok
2097152 10.7acc9476 (10.1476) [278,142,436]
[278,142,436] found
default.14654.445__multipart_c9f8db1b-cee2-4ec8-8fb3-8b4bc7585d80.1456231572.877051.ismv.b73N3OmW4OhCjDYSR-RTkZNNIKA1C9Z.57
0
diff
4194304 10.4f5be025 (10.25)
[57,310,428] [57,310,428] found
default.14654.445__shadow_c9f8db1b-cee2-4ec8-8fb3-8b4bc7585d80.1456231572.877051.ismv.b73N3OmW4OhCjDYSR-RTkZNNIKA1C9Z.57_1
4194304 ok
4194304 10.81191602 (10.1602) [441,109,420]
[441,109,420] found
default.14654.445__shadow_c9f8db1b-cee2-4ec8-8fb3-8b4bc7585d80.1456231572.877051.ismv.b73N3OmW4OhCjDYSR-RTkZNNIKA1C9Z.57_2
2097152 ok
2097152 10.ce729ebe (10.1ebe)
[260,469,9] [260,469,9] "Silent"
GETs ============= # object
size from headers $
s3 -u head
video-shbc/c9f8db1b-cee2-4ec8-8fb3-8b4bc7585d80.1456231572.877051.ismv
Content-Type: binary/octet-stream Content-Length:
641775701 Server:
nginx #
but GETs only 637581397 (641775701 - missing 4194304 = 637581397) $
s3 -u get
video-shbc/c9f8db1b-cee2-4ec8-8fb3-8b4bc7585d80.1456231572.877051.ismv >
/tmp/test $ ls
-al /tmp/test -rw-r--r--
1 root root 637581397 Feb 23 17:05 /tmp/test #
no error in logs 127.0.0.1
- - [23/Feb/2016:17:05:00 +0100] "GET /video-shbc/c9f8db1b-cee2-4ec8-8fb3-8b4bc7585d80.1456231572.877051.ismv
HTTP/1.0" 200 637581711 "-" "Mozilla/4.0 (Compatible; s3;
libs3 2.0; Linux x86_64)" # wget - retry for missing part, but there is no missing part,
so it GETs head/tail of the file again.... $
wget
http://127.0.0.1:88/video-shbc/c9f8db1b-cee2-4ec8-8fb3-8b4bc7585d80.1456231572.877051.ismv --2016-02-23
17:10:11--
http://127.0.0.1:88/video-shbc/c9f8db1b-cee2-4ec8-8fb3-8b4bc7585d80.1456231572.877051.ismv Connecting
to 127.0.0.1:88... connected. HTTP
request sent, awaiting response... 200 OK Length:
641775701 (612M) [binary/octet-stream] Saving
to: `c9f8db1b-cee2-4ec8-8fb3-8b4bc7585d80.1456231572.877051.ismv' 99%
[==================================================================================================================>
] 637,581,397 63.9M/s in 9.5s 2016-02-23
17:10:20 (64.1 MB/s) - Connection closed at byte 637581397. Retrying. --2016-02-23
17:10:21-- (try: 2)
http://127.0.0.1:88/video-shbc/c9f8db1b-cee2-4ec8-8fb3-8b4bc7585d80.1456231572.877051.ismv Connecting
to 127.0.0.1:88... connected. HTTP
request sent, awaiting response... 206 Partial Content Length:
641775701 (612M), 4194304 (4.0M) remaining [binary/octet-stream] Saving
to: `c9f8db1b-cee2-4ec8-8fb3-8b4bc7585d80.1456231572.877051.ismv' 100%[+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>]
641,775,701 --.-K/s in 0.007s 2016-02-23
17:10:21 (601 MB/s) -
`c9f8db1b-cee2-4ec8-8fb3-8b4bc7585d80.1456231572.877051.ismv' saved
[641775701/641775701] |
_______________________________________________ ceph-users mailing list ceph-users@xxxxxxxxxxxxxx http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com