Re: RGW segmentation fault on Pacific 16.2.1 with multipart upload

See this tracker issue:
https://tracker.ceph.com/issues/50556

and this PR:
https://github.com/ceph/ceph/pull/41288

Daniel

On 5/12/21 7:00 AM, Daniel Iwan wrote:
Hi,
I have started to see segfaults during multipart upload to one of the
buckets.
The file is about 60 MB in size.
Uploading the same file to a brand-new bucket works OK.

Command used:
aws --profile=tester --endpoint=$HOST_S3_API --region="" s3 cp \
  ./pack-a9201afb4682b74c7c5a5d6070e661662bdfea1a.pack \
  s3://tester-bucket/pack-a9201afb4682b74c7c5a5d6070e661662bdfea1a.pack

For some reason the log shows the upload going to tester-bucket-2 instead.
Bucket tester-bucket-2 is owned by the same user, TESTER.

I'm using Ceph 16.2.1 (recently upgraded from Octopus).
It is installed with cephadm in Docker.
The OS is Ubuntu 18.04.5 LTS.

The logs are shown below:

May 11 11:00:46 ceph-om-vm-node1 bash[27881]: debug
2021-05-11T11:00:46.891+0000 7ffb0e25e700  1 ====== starting new request
req=0x7ffa8e15d620 =====
May 11 11:00:46 ceph-om-vm-node1 bash[27881]: debug
2021-05-11T11:00:46.907+0000 7ffb0b258700  1 ====== req done
req=0x7ffa8e15d620 op status=0 http_status=200 latency=0.011999841s ======
May 11 11:00:46 ceph-om-vm-node1 bash[27881]: debug
2021-05-11T11:00:46.907+0000 7ffb0b258700  1 beast: 0x7ffa8e15d620:
11.1.150.14 - TESTER [11/May/2021:11:00:46.891 +0000] "POST
/tester-bucket-2/pack-a9201afb4682b74c7c5a5d6070e661662bdfea1a.pack?uploads
HTTP/1.1" 200 296 - "aws-cli/2.1.23 Python/3.7.3
Linux/4.19.128-microsoft-standard exe/x86_64.ubuntu.18 p
May 11 11:00:47 ceph-om-vm-node1 bash[27881]: debug
2021-05-11T11:00:47.055+0000 7ffb09254700  1 ====== starting new request
req=0x7ffa8e15d620 =====
May 11 11:00:47 ceph-om-vm-node1 bash[27881]: debug
2021-05-11T11:00:47.355+0000 7ffb51ae5700  1 ====== starting new request
req=0x7ffa8e0dc620 =====
May 11 11:00:47 ceph-om-vm-node1 bash[27881]: debug
2021-05-11T11:00:47.355+0000 7ffb4eadf700  1 ====== starting new request
req=0x7ffa8e05b620 =====
May 11 11:00:47 ceph-om-vm-node1 bash[27881]: debug
2021-05-11T11:00:47.355+0000 7ffb46acf700  1 ====== starting new request
req=0x7ffa8df59620 =====
May 11 11:00:47 ceph-om-vm-node1 bash[27881]: debug
2021-05-11T11:00:47.355+0000 7ffb44acb700  1 ====== starting new request
req=0x7ffa8ded8620 =====
May 11 11:00:47 ceph-om-vm-node1 bash[27881]: debug
2021-05-11T11:00:47.355+0000 7ffb3dabd700  1 ====== starting new request
req=0x7ffa8dfda620 =====
May 11 11:00:47 ceph-om-vm-node1 bash[27881]: debug
2021-05-11T11:00:47.359+0000 7ffb1d27c700  1 ====== starting new request
req=0x7ffa8de57620 =====
May 11 11:00:47 ceph-om-vm-node1 bash[27881]: debug
2021-05-11T11:00:47.359+0000 7ffb22a87700  1 ====== starting new request
req=0x7ffa8ddd6620 =====
May 11 11:00:48 ceph-om-vm-node1 bash[27881]: debug
2021-05-11T11:00:48.275+0000 7ffb2d29c700  1 ====== req done
req=0x7ffa8e15d620 op status=0 http_status=200 latency=1.219983697s ======
May 11 11:00:48 ceph-om-vm-node1 bash[27881]: debug
2021-05-11T11:00:48.275+0000 7ffb2d29c700  1 beast: 0x7ffa8e15d620:
11.1.150.14 - TESTER [11/May/2021:11:00:47.055 +0000] "PUT
/tester-bucket-2/pack-a9201afb4682b74c7c5a5d6070e661662bdfea1a.pack?uploadId=2~JhGavMwngl_FH6-LcE2vFxMRjcf4qTF&partNumber=8
HTTP/1.1" 200 2485288 - "aws-cli/2.1.23 Python/3.7.3 Linux
May 11 11:00:54 ceph-om-vm-node1 bash[27881]: debug
2021-05-11T11:00:54.695+0000 7ffad89f3700  1 ====== req done
req=0x7ffa8ddd6620 op status=0 http_status=200 latency=7.335902214s ======
May 11 11:00:54 ceph-om-vm-node1 bash[27881]: debug
2021-05-11T11:00:54.695+0000 7ffad89f3700  1 beast: 0x7ffa8ddd6620:
11.1.150.14 - TESTER [11/May/2021:11:00:47.359 +0000] "PUT
/tester-bucket-2/pack-a9201afb4682b74c7c5a5d6070e661662bdfea1a.pack?uploadId=2~JhGavMwngl_FH6-LcE2vFxMRjcf4qTF&partNumber=6
HTTP/1.1" 200 8388608 - "aws-cli/2.1.23 Python/3.7.3 Linux
May 11 11:00:56 ceph-om-vm-node1 bash[27881]: debug
2021-05-11T11:00:56.871+0000 7ffb11a65700  1 ====== req done
req=0x7ffa8e0dc620 op status=0 http_status=200 latency=9.515872955s ======
May 11 11:00:56 ceph-om-vm-node1 bash[27881]: debug
2021-05-11T11:00:56.871+0000 7ffb11a65700  1 beast: 0x7ffa8e0dc620:
11.1.150.14 - TESTER [11/May/2021:11:00:47.355 +0000] "PUT
/tester-bucket-2/pack-a9201afb4682b74c7c5a5d6070e661662bdfea1a.pack?uploadId=2~JhGavMwngl_FH6-LcE2vFxMRjcf4qTF&partNumber=7
HTTP/1.1" 200 8388608 - "aws-cli/2.1.23 Python/3.7.3 Linux
May 11 11:00:59 ceph-om-vm-node1 bash[27881]: debug
2021-05-11T11:00:59.491+0000 7ffac89d3700  1 ====== req done
req=0x7ffa8dfda620 op status=0 http_status=200 latency=12.135838509s ======
May 11 11:00:59 ceph-om-vm-node1 bash[27881]: debug
2021-05-11T11:00:59.491+0000 7ffac89d3700  1 beast: 0x7ffa8dfda620:
11.1.150.14 - TESTER [11/May/2021:11:00:47.355 +0000] "PUT
/tester-bucket-2/pack-a9201afb4682b74c7c5a5d6070e661662bdfea1a.pack?uploadId=2~JhGavMwngl_FH6-LcE2vFxMRjcf4qTF&partNumber=2
HTTP/1.1" 200 8388608 - "aws-cli/2.1.23 Python/3.7.3 Linux
May 11 11:01:02 ceph-om-vm-node1 bash[27881]: debug
2021-05-11T11:01:02.891+0000 7ffb68312700  1 ====== req done
req=0x7ffa8e05b620 op status=0 http_status=200 latency=15.535793304s ======
May 11 11:01:02 ceph-om-vm-node1 bash[27881]: debug
2021-05-11T11:01:02.891+0000 7ffb68312700  1 beast: 0x7ffa8e05b620:
11.1.150.14 - TESTER [11/May/2021:11:00:47.355 +0000] "PUT
/tester-bucket-2/pack-a9201afb4682b74c7c5a5d6070e661662bdfea1a.pack?uploadId=2~JhGavMwngl_FH6-LcE2vFxMRjcf4qTF&partNumber=4
HTTP/1.1" 200 8388608 - "aws-cli/2.1.23 Python/3.7.3 Linux
May 11 11:01:03 ceph-om-vm-node1 bash[27881]: debug
2021-05-11T11:01:03.299+0000 7ffb70b23700  1 ====== req done
req=0x7ffa8df59620 op status=0 http_status=200 latency=15.943787575s ======
May 11 11:01:03 ceph-om-vm-node1 bash[27881]: debug
2021-05-11T11:01:03.299+0000 7ffb70b23700  1 beast: 0x7ffa8df59620:
11.1.150.14 - TESTER [11/May/2021:11:00:47.355 +0000] "PUT
/tester-bucket-2/pack-a9201afb4682b74c7c5a5d6070e661662bdfea1a.pack?uploadId=2~JhGavMwngl_FH6-LcE2vFxMRjcf4qTF&partNumber=3
HTTP/1.1" 200 8388608 - "aws-cli/2.1.23 Python/3.7.3 Linux
May 11 11:01:03 ceph-om-vm-node1 bash[27881]: debug
2021-05-11T11:01:03.647+0000 7ffb8534c700  1 ====== req done
req=0x7ffa8ded8620 op status=0 http_status=200 latency=16.291782379s ======
May 11 11:01:03 ceph-om-vm-node1 bash[27881]: debug
2021-05-11T11:01:03.647+0000 7ffb8534c700  1 beast: 0x7ffa8ded8620:
11.1.150.14 - TESTER [11/May/2021:11:00:47.355 +0000] "PUT
/tester-bucket-2/pack-a9201afb4682b74c7c5a5d6070e661662bdfea1a.pack?uploadId=2~JhGavMwngl_FH6-LcE2vFxMRjcf4qTF&partNumber=1
HTTP/1.1" 200 8388608 - "aws-cli/2.1.23 Python/3.7.3 Linux
May 11 11:01:03 ceph-om-vm-node1 bash[27881]: debug
2021-05-11T11:01:03.835+0000 7ffabe9bf700  1 ====== req done
req=0x7ffa8de57620 op status=0 http_status=200 latency=16.475780487s ======
May 11 11:01:03 ceph-om-vm-node1 bash[27881]: debug
2021-05-11T11:01:03.835+0000 7ffabe9bf700  1 beast: 0x7ffa8de57620:
11.1.150.14 - TESTER [11/May/2021:11:00:47.359 +0000] "PUT
/tester-bucket-2/pack-a9201afb4682b74c7c5a5d6070e661662bdfea1a.pack?uploadId=2~JhGavMwngl_FH6-LcE2vFxMRjcf4qTF&partNumber=5
HTTP/1.1" 200 8388608 - "aws-cli/2.1.23 Python/3.7.3 Linux
May 11 11:01:03 ceph-om-vm-node1 bash[27881]: debug
2021-05-11T11:01:03.875+0000 7ffabf1c0700  1 ====== starting new request
req=0x7ffa8de57620 =====
May 11 11:01:03 ceph-om-vm-node1 bash[27881]: debug
2021-05-11T11:01:03.895+0000 7ffaa0983700  1 ====== req done
req=0x7ffa8de57620 op status=0 http_status=200 latency=0.019999731s ======
May 11 11:01:03 ceph-om-vm-node1 bash[27881]: debug
2021-05-11T11:01:03.895+0000 7ffaa0983700  1 beast: 0x7ffa8de57620:
11.1.150.14 - TESTER [11/May/2021:11:01:03.875 +0000] "POST
/tester-bucket-2/pack-a9201afb4682b74c7c5a5d6070e661662bdfea1a.pack?uploadId=2~JhGavMwngl_FH6-LcE2vFxMRjcf4qTF
HTTP/1.1" 200 400 - "aws-cli/2.1.23 Python/3.7.3 Linux/4.19.128-micros
May 11 11:01:06 ceph-om-vm-node1 bash[27881]: debug
2021-05-11T11:01:06.147+0000 7ffac31c8700  1 failed to read header: The
socket was closed due to a timeout
May 11 11:01:06 ceph-om-vm-node1 bash[27881]: debug
2021-05-11T11:01:06.147+0000 7ffac31c8700  1 ====== req done
http_status=400 ======
May 11 11:01:16 ceph-om-vm-node1 bash[27881]: debug
2021-05-11T11:01:16.667+0000 7ffab51ac700  1 failed to read header: The
socket was closed due to a timeout
May 11 11:01:16 ceph-om-vm-node1 bash[27881]: debug
2021-05-11T11:01:16.667+0000 7ffab51ac700  1 ====== req done
http_status=400 ======
May 11 11:01:17 ceph-om-vm-node1 bash[27881]: debug
2021-05-11T11:01:17.687+0000 7ffaa598d700  1 failed to read header: The
socket was closed due to a timeout
May 11 11:01:17 ceph-om-vm-node1 bash[27881]: debug
2021-05-11T11:01:17.687+0000 7ffaa598d700  1 ====== req done
http_status=400 ======
May 11 11:01:18 ceph-om-vm-node1 bash[27881]: debug
2021-05-11T11:01:18.179+0000 7ffa9e97f700  1 ====== starting new request
req=0x7ffbd40b8620 =====
May 11 11:01:18 ceph-om-vm-node1 bash[27881]: *** Caught signal
(Segmentation fault) **
May 11 11:01:18 ceph-om-vm-node1 bash[27881]:  in thread 7ffac89d3700
thread_name:radosgw
May 11 11:01:18 ceph-om-vm-node1 bash[27881]:  ceph version 16.2.1
(afb9061ab4117f798c858c741efa6390e48ccf10) pacific (stable)
May 11 11:01:18 ceph-om-vm-node1 bash[27881]:  1:
/lib64/libpthread.so.0(+0x12b20) [0x7ffbc8558b20]
May 11 11:01:18 ceph-om-vm-node1 bash[27881]:  2:
(rgw_bucket::rgw_bucket(rgw_bucket const&)+0x23) [0x7ffbd3393403]
May 11 11:01:18 ceph-om-vm-node1 bash[27881]:  3:
(rgw::sal::RGWObject::get_obj() const+0x20) [0x7ffbd33c1bb0]
May 11 11:01:18 ceph-om-vm-node1 bash[27881]:  4:
(RGWInitMultipart::verify_permission(optional_yield)+0x6c) [0x7ffbd36abf4c]
May 11 11:01:18 ceph-om-vm-node1 bash[27881]:  5:
(rgw_process_authenticated(RGWHandler_REST*, RGWOp*&, RGWRequest*,
req_state*, optional_yield, bool)+0x898) [0x7ffbd3373f58]
May 11 11:01:18 ceph-om-vm-node1 bash[27881]:  6:
(process_request(rgw::sal::RGWRadosStore*, RGWREST*, RGWRequest*,
std::__cxx11::basic_string<char, std::char_traits<char>,
std::allocator<char> > const&, rgw::auth::StrategyRegistry const&,
RGWRestfulIO*, OpsLogSocket*, optional_yield, rgw::dmclock::Scheduler*,
std::__cxx11::basic_string<char, std::char_traits
May 11 11:01:18 ceph-om-vm-node1 bash[27881]:  7:
/lib64/libradosgw.so.2(+0x49510d) [0x7ffbd32ca10d]
May 11 11:01:18 ceph-om-vm-node1 bash[27881]:  8:
/lib64/libradosgw.so.2(+0x496b74) [0x7ffbd32cbb74]
May 11 11:01:18 ceph-om-vm-node1 bash[27881]:  9:
/lib64/libradosgw.so.2(+0x496dde) [0x7ffbd32cbdde]
May 11 11:01:18 ceph-om-vm-node1 bash[27881]:  10: make_fcontext()
May 11 11:01:18 ceph-om-vm-node1 bash[27881]: debug
2021-05-11T11:01:18.183+0000 7ffac89d3700 -1 *** Caught signal
(Segmentation fault) **
May 11 11:01:18 ceph-om-vm-node1 bash[27881]:  in thread 7ffac89d3700
thread_name:radosgw
May 11 11:01:18 ceph-om-vm-node1 bash[27881]:  ceph version 16.2.1
(afb9061ab4117f798c858c741efa6390e48ccf10) pacific (stable)
May 11 11:01:18 ceph-om-vm-node1 bash[27881]:  1:
/lib64/libpthread.so.0(+0x12b20) [0x7ffbc8558b20]
May 11 11:01:18 ceph-om-vm-node1 bash[27881]:  2:
(rgw_bucket::rgw_bucket(rgw_bucket const&)+0x23) [0x7ffbd3393403]
May 11 11:01:18 ceph-om-vm-node1 bash[27881]:  3:
(rgw::sal::RGWObject::get_obj() const+0x20) [0x7ffbd33c1bb0]
May 11 11:01:18 ceph-om-vm-node1 bash[27881]:  4:
(RGWInitMultipart::verify_permission(optional_yield)+0x6c) [0x7ffbd36abf4c]
May 11 11:01:18 ceph-om-vm-node1 bash[27881]:  5:
(rgw_process_authenticated(RGWHandler_REST*, RGWOp*&, RGWRequest*,
req_state*, optional_yield, bool)+0x898) [0x7ffbd3373f58]
May 11 11:01:18 ceph-om-vm-node1 bash[27881]:  6:
(process_request(rgw::sal::RGWRadosStore*, RGWREST*, RGWRequest*,
std::__cxx11::basic_string<char, std::char_traits<char>,
std::allocator<char> > const&, rgw::auth::StrategyRegistry const&,
RGWRestfulIO*, OpsLogSocket*, optional_yield, rgw::dmclock::Scheduler*,
std::__cxx11::basic_string<char, std::char_traits
May 11 11:01:18 ceph-om-vm-node1 bash[27881]:  7:
/lib64/libradosgw.so.2(+0x49510d) [0x7ffbd32ca10d]
May 11 11:01:18 ceph-om-vm-node1 bash[27881]:  8:
/lib64/libradosgw.so.2(+0x496b74) [0x7ffbd32cbb74]
May 11 11:01:18 ceph-om-vm-node1 bash[27881]:  9:
/lib64/libradosgw.so.2(+0x496dde) [0x7ffbd32cbdde]
May 11 11:01:18 ceph-om-vm-node1 bash[27881]:  10: make_fcontext()
May 11 11:01:18 ceph-om-vm-node1 bash[27881]:  NOTE: a copy of the
executable, or `objdump -rdS <executable>` is needed to interpret this.
May 11 11:01:18 ceph-om-vm-node1 bash[27881]: --- begin dump of recent
events ---
May 11 11:01:18 ceph-om-vm-node1 bash[27881]: debug  -2732>
2021-05-11T10:57:05.234+0000 7ffbd4103440  5 asok(0x55f96fef0000)
register_command assert hook 0x55f96fdf4580
May 11 11:01:18 ceph-om-vm-node1 bash[27881]: debug  -2731>
2021-05-11T10:57:05.234+0000 7ffbd4103440  5 asok(0x55f96fef0000)
register_command abort hook 0x55f96fdf4580
May 11 11:01:18 ceph-om-vm-node1 bash[27881]: debug  -2730>
2021-05-11T10:57:05.234+0000 7ffbd4103440  5 asok(0x55f96fef0000)
register_command leak_some_memory hook 0x55f96fdf4580
May 11 11:01:18 ceph-om-vm-node1 bash[27881]: debug  -2729>
2021-05-11T10:57:05.234+0000 7ffbd4103440  5 asok(0x55f96fef0000)
register_command perfcounters_dump hook 0x55f96fdf4580
May 11 11:01:18 ceph-om-vm-node1 bash[27881]: debug  -2728>
2021-05-11T10:57:05.234+0000 7ffbd4103440  5 asok(0x55f96fef0000)
register_command 1 hook 0x55f96fdf4580
May 11 11:01:18 ceph-om-vm-node1 bash[27881]: debug  -2727>
2021-05-11T10:57:05.234+0000 7ffbd4103440  5 asok(0x55f96fef0000)
register_command perf dump hook 0x55f96fdf4580
May 11 11:01:18 ceph-om-vm-node1 bash[27881]: debug  -2726>
2021-05-11T10:57:05.234+0000 7ffbd4103440  5 asok(0x55f96fef0000)
register_command perfcounters_schema hook 0x55f96fdf4580
May 11 11:01:18 ceph-om-vm-node1 bash[27881]: debug  -2725>
2021-05-11T10:57:05.234+0000 7ffbd4103440  5 asok(0x55f96fef0000)
register_command perf histogram dump hook 0x55f96fdf4580
May 11 11:01:18 ceph-om-vm-node1 bash[27881]: debug  -2724>
2021-05-11T10:57:05.234+0000 7ffbd4103440  5 asok(0x55f96fef0000)
register_command 2 hook 0x55f96fdf4580
May 11 11:01:18 ceph-om-vm-node1 bash[27881]: debug  -2723>
2021-05-11T10:57:05.234+0000 7ffbd4103440  5 asok(0x55f96fef0000)
register_command perf schema hook 0x55f96fdf4580
May 11 11:01:18 ceph-om-vm-node1 bash[27881]: debug  -2722>
2021-05-11T10:57:05.234+0000 7ffbd4103440  5 asok(0x55f96fef0000)
register_command perf histogram schema hook 0x55f96fdf4580
May 11 11:01:18 ceph-om-vm-node1 bash[27881]: debug  -2721>
2021-05-11T10:57:05.234+0000 7ffbd4103440  5 asok(0x55f96fef0000)
register_command perf reset hook 0x55f96fdf4580
May 11 11:01:18 ceph-om-vm-node1 bash[27881]: debug  -2720>
2021-05-11T10:57:05.234+0000 7ffbd4103440  5 asok(0x55f96fef0000)
register_command config show hook 0x55f96fdf4580
May 11 11:01:18 ceph-om-vm-node1 bash[27881]: debug  -2719>
2021-05-11T10:57:05.234+0000 7ffbd4103440  5 asok(0x55f96fef0000)
register_command config help hook 0x55f96fdf4580
May 11 11:01:18 ceph-om-vm-node1 bash[27881]: debug  -2718>
2021-05-11T10:57:05.234+0000 7ffbd4103440  5 asok(0x55f96fef0000)
register_command config set hook 0x55f96fdf4580
May 11 11:01:18 ceph-om-vm-node1 bash[27881]: debug  -2717>
2021-05-11T10:57:05.234+0000 7ffbd4103440  5 asok(0x55f96fef0000)
register_command config unset hook 0x55f96fdf4580
May 11 11:01:18 ceph-om-vm-node1 bash[27881]: debug  -2716>
2021-05-11T10:57:05.234+0000 7ffbd4103440  5 asok(0x55f96fef0000)
register_command config get hook 0x55f96fdf4580
May 11 11:01:18 ceph-om-vm-node1 bash[27881]: debug  -2715>
2021-05-11T10:57:05.234+0000 7ffbd4103440  5 asok(0x55f96fef0000)
register_command config diff hook 0x55f96fdf4580
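
To cross-check which bucket each request actually targeted, the beast access entries above could be parsed with a short script. This is only a minimal sketch: the function name parse_request and the regular expression are illustrative, not part of Ceph, and it assumes path-style addressing (bucket as the first path segment), as seen in the log lines above.

```python
import re
from urllib.parse import urlsplit, parse_qs

# Matches the quoted request line of a beast access entry, e.g.
# "PUT /bucket/key?uploadId=...&partNumber=8 HTTP/1.1"
REQUEST_RE = re.compile(r'"(?P<method>[A-Z]+)\s+(?P<target>\S+)\s+HTTP/[\d.]+"')


def parse_request(entry):
    """Return (method, bucket, key, part_number), or None if no request line is found.

    Hypothetical helper; assumes path-style requests (/bucket/key).
    """
    # Mail wrapping splits one log entry across several lines, so
    # collapse all whitespace runs into single spaces first.
    flat = " ".join(entry.split())
    match = REQUEST_RE.search(flat)
    if match is None:
        return None
    parts = urlsplit(match.group("target"))
    bucket, _, key = parts.path.lstrip("/").partition("/")
    query = parse_qs(parts.query, keep_blank_values=True)
    part = query.get("partNumber", [None])[0]
    return match.group("method"), bucket, key, int(part) if part else None


# One wrapped entry copied from the log above.
sample = '''11.1.150.14 - TESTER [11/May/2021:11:00:47.055 +0000] "PUT
/tester-bucket-2/pack-a9201afb4682b74c7c5a5d6070e661662bdfea1a.pack?uploadId=2~JhGavMwngl_FH6-LcE2vFxMRjcf4qTF&partNumber=8
HTTP/1.1" 200 2485288 - "aws-cli/2.1.23 Python/3.7.3 Linux'''

if __name__ == "__main__":
    print(parse_request(sample))
```

Run over all entries, the bucket field can be compared against the bucket given on the command line (tester-bucket) to list every request that was logged against a different one.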

Regards
Daniel
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
