Hey all,

I'm creating a new post for this issue as we've narrowed the problem down to a part-size limitation on multipart uploads. We have discovered, in both our production Nautilus (14.2.11) cluster and our lab Nautilus (14.2.10) cluster, that multipart uploads with a configured part size greater than 16777216 bytes (16 MiB) return a status 500 / internal server error from radosgw.

So far I have increased the following rgw settings that looked suspect, without any success or improvement with larger part sizes:

    "rgw_get_obj_window_size": "16777216",
    "rgw_put_obj_min_window_size": "16777216",

I am trying to determine whether this is caused by a conservative default setting somewhere that I don't know about, or whether it is perhaps a bug. I would appreciate it if someone running rgw on Nautilus could also test and provide feedback. It's very easy to reproduce; configuring your part size with aws2cli just requires the following in your aws 'config' file:

    s3 =
      multipart_chunksize = 32MB

rgw server logs during a failed multipart upload (32MB chunk/part size):

2020-09-08 15:59:36.054 7f2d32fa6700  1 ====== starting new request req=0x55953dc36930 =====
2020-09-08 15:59:36.082 7f2d32fa6700 -1 res_query() failed
2020-09-08 15:59:36.138 7f2d32fa6700  1 ====== req done req=0x55953dc36930 op status=0 http_status=200 latency=0.0839988s ======
2020-09-08 16:00:07.285 7f2d3dfbc700  1 ====== starting new request req=0x55953dc36930 =====
2020-09-08 16:00:07.285 7f2d3dfbc700 -1 res_query() failed
2020-09-08 16:00:07.353 7f2d00741700  1 ====== starting new request req=0x55954dd5e930 =====
2020-09-08 16:00:07.357 7f2d00741700 -1 res_query() failed
2020-09-08 16:00:07.413 7f2cc56cb700  1 ====== starting new request req=0x55953dc02930 =====
2020-09-08 16:00:07.417 7f2cc56cb700 -1 res_query() failed
2020-09-08 16:00:07.473 7f2cb26a5700  1 ====== starting new request req=0x5595426f6930 =====
2020-09-08 16:00:07.473 7f2cb26a5700 -1 res_query() failed
2020-09-08 16:00:09.465 7f2d3dfbc700  0 WARNING: set_req_state_err err_no=35 resorting to 500
2020-09-08 16:00:09.465 7f2d3dfbc700  1 ====== req done req=0x55953dc36930 op status=-35 http_status=500 latency=2.17997s ======
2020-09-08 16:00:09.549 7f2d00741700  0 WARNING: set_req_state_err err_no=35 resorting to 500
2020-09-08 16:00:09.549 7f2d00741700  1 ====== req done req=0x55954dd5e930 op status=-35 http_status=500 latency=2.19597s ======
2020-09-08 16:00:09.605 7f2cc56cb700  0 WARNING: set_req_state_err err_no=35 resorting to 500
2020-09-08 16:00:09.609 7f2cc56cb700  1 ====== req done req=0x55953dc02930 op status=-35 http_status=500 latency=2.19597s ======
2020-09-08 16:00:09.641 7f2cb26a5700  0 WARNING: set_req_state_err err_no=35 resorting to 500
2020-09-08 16:00:09.641 7f2cb26a5700  1 ====== req done req=0x5595426f6930 op status=-35 http_status=500 latency=2.16797s ======

awscli client-side output during a failed multipart upload:

root@jump:~# aws --no-verify-ssl --endpoint-url http://lab-object.cancercollaboratory.org:7480 s3 cp 4GBfile s3://troubleshooting
upload failed: ./4GBfile to s3://troubleshooting/4GBfile An error occurred (UnknownError) when calling the UploadPart operation (reached max retries: 2): Unknown

Thanks,

Jared Baker
Cloud Architect for the Cancer Genome Collaboratory
Ontario Institute for Cancer Research

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
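
P.S. If it's easier for anyone to test this outside of awscli, below is a minimal boto3 sketch that should exercise the same part size. It's an illustration rather than exactly what we ran: the endpoint and bucket names are just the ones from the awscli example above, and it assumes S3 credentials for the rgw user are already set in the environment or ~/.aws.

    import boto3
    from boto3.s3.transfer import TransferConfig

    # Endpoint/bucket are the ones from the awscli example above; substitute your own.
    s3 = boto3.client(
        's3',
        endpoint_url='http://lab-object.cancercollaboratory.org:7480',
        verify=False,  # mirrors --no-verify-ssl
    )

    # Any part size above 16 MiB triggers the 500s for us; 32 MiB shown here.
    config = TransferConfig(
        multipart_threshold=32 * 1024 * 1024,
        multipart_chunksize=32 * 1024 * 1024,
    )

    s3.upload_file('4GBfile', 'troubleshooting', '4GBfile', Config=config)

Dropping multipart_chunksize back to 16 * 1024 * 1024 makes the same upload succeed for us.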