radosgw multi file upload failure

Hello,
Recently we've observed on one of our Ceph clusters that uploading a large number of small files (~2000 × 2 KB) fails. The HTTP return code shows 200, but the file upload still fails. Here is an example from the log:

2018-06-27 07:34:40.624103 7f0dc67cc700  1 ====== starting new request req=0x7f0dc67c68a0 =====
2018-06-27 07:34:40.645039 7f0dc3fc7700  1 ====== starting new request req=0x7f0dc3fc18a0 =====
2018-06-27 07:34:40.682108 7f0dc3fc7700  0 WARNING: couldn't find acl header for object, generating default
2018-06-27 07:34:40.962674 7f0dcbfd7700  0 ERROR: client_io->complete_request() returned -5
2018-06-27 07:34:40.962689 7f0dcbfd7700  1 ====== req done req=0x7f0dcbfd18a0 op status=0 http_status=200 ======
2018-06-27 07:34:40.962738 7f0dcbfd7700  1 civetweb: 0x7f0df4004160: 10.x.x.x. - - [27/Jun/2018:07:34:34 +0000] "POST xxxx-xxxx HTTP/1.1" 200 0 - aws-sdk-java/1.6.4 Linux/3.17.6-200.fc20.x86_64 Java_HotSpot(TM)_64-Bit_Server_VM/25.73-b02

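For reference, -5 is -EIO, and it comes from client_io->complete_request(), which as far as I understand is the step that writes the response back to the client, so the http_status=200 in the log is what radosgw intended to send rather than what the client necessarily received. Below is a rough boto3 sketch of the kind of test we run (our real client is aws-sdk-java 1.6.4; the endpoint, bucket name and credentials here are placeholders): it uploads ~2000 small objects concurrently and then verifies each one with a HEAD, which shows whether the "successful" uploads actually left readable objects behind.

#!/usr/bin/env python3
# Sketch only: upload many small objects and verify each one after the PUT,
# using boto3 instead of the aws-sdk-java client from the log.
# Endpoint, bucket name and credentials below are placeholders.
import concurrent.futures
import os

import boto3
from botocore.client import Config
from botocore.exceptions import ClientError

ENDPOINT = "http://rgw.example.com:7480"   # placeholder RGW endpoint
BUCKET = "upload-test"                     # placeholder bucket (must already exist)
N_OBJECTS = 2000
BODY = os.urandom(2048)                    # ~2 KB per object, as in the report

s3 = boto3.client(
    "s3",
    endpoint_url=ENDPOINT,
    aws_access_key_id="ACCESS_KEY",        # placeholder credentials
    aws_secret_access_key="SECRET_KEY",
    config=Config(s3={"addressing_style": "path"}),
)

def upload_and_verify(i):
    key = f"small/{i:05d}"
    s3.put_object(Bucket=BUCKET, Key=key, Body=BODY)   # client sees 200 here
    try:
        # Independent check: is the object really there, with the full size?
        head = s3.head_object(Bucket=BUCKET, Key=key)
        return key, head["ContentLength"] == len(BODY)
    except ClientError:
        return key, False

with concurrent.futures.ThreadPoolExecutor(max_workers=32) as pool:
    results = list(pool.map(upload_and_verify, range(N_OBJECTS)))

failed = [key for key, ok in results if not ok]
print(f"{len(failed)} of {N_OBJECTS} uploads failed verification")
for key in failed[:20]:
    print("  missing or truncated:", key)
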
I tried tuning performance using the parameters below, but the upload failures persist, so I suspect this is not a concurrency issue.
rgw num rados handles = 8
rgw thread pool size = 512
rgw frontends = civetweb port=7480 num_threads=512

I also tried increasing the logging level for rgw and civetweb to 20/5, but I don't see anything that points to the issue.

2018-06-28 18:00:24.575460 7f3d7dfc3700 20 get_obj_state: s->obj_tag was set empty
2018-06-28 18:00:24.575491 7f3d7dfc3700 20 get_obj_state: rctx=0x7f3d7dfbcff0 obj=files:_multipart_xxxx-xxxx.error.2~Rh1AqHvzgCPc0NGWMl-FHE0Y-HvCcmk.1 state=0x7f3e04024d88 s->prefetch_data=0
2018-06-28 18:00:24.575496 7f3d7dfc3700 20 prepare_atomic_modification: state is not atomic. state=0x7f3e04024d88
2018-06-28 18:00:24.575555 7f3d7dfc3700 20 reading from default.rgw.data.root:.bucket.meta.files:xxxx-xxxx.6432.11
2018-06-28 18:00:24.575567 7f3d7dfc3700 20 get_system_obj_state: rctx=0x7f3d7dfbb5d0 obj=default.rgw.data.root:.bucket.meta.files:xxxx-xxxx.6432.11 state=0x7f3e04001228 s->prefetch_data
2018-06-28 18:00:24.575577 7f3d7dfc3700 10 cache get: name=default.rgw.data.root+.bucket.meta.files:xxxx-xxxx.6432.11 : hit (requested=22, cached=23)
2018-06-28 18:00:24.575586 7f3d7dfc3700 20 get_system_obj_state: s->obj_tag was set empty
2018-06-28 18:00:24.575592 7f3d7dfc3700 10 cache get: name=default.rgw.data.root+.bucket.meta.files:xxxx-xxxx.6432.11 : hit (requested=17, cached=23)
2018-06-28 18:00:24.575614 7f3d7dfc3700 20  bucket index object: .dir.xxxx-xxxx.6432.11
2018-06-28 18:00:24.606933 7f3d67796700  2 req 9567:5.505460:s3:POST xxxx-xxxx.error:init_multipart:completing
2018-06-28 18:00:24.607025 7f3d67796700  0 ERROR: client_io->complete_request() returned -5
2018-06-28 18:00:24.607036 7f3d67796700  2 req 9567:5.505572:s3:POST xxxx-xxxx.error:init_multipart:op status=0
2018-06-28 18:00:24.607040 7f3d67796700  2 req 9567:5.505578:s3:POST xxxx-xxxx.error:init_multipart:http status=200
2018-06-28 18:00:24.607046 7f3d67796700  1 ====== req done req=0x7f3d677908a0 op status=0 http_status=200 ======

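Since the failing request in this trace is the multipart initiation (init_multipart) and the error is again in complete_request(), I would expect any upload whose client never received the UploadId to be left behind as an open multipart upload. A quick boto3 sketch to check for those (same placeholder endpoint, bucket and credentials as above):

#!/usr/bin/env python3
# Sketch only: list multipart uploads that were initiated but never completed
# or aborted. Endpoint, bucket name and credentials are placeholders again.
import boto3
from botocore.client import Config

s3 = boto3.client(
    "s3",
    endpoint_url="http://rgw.example.com:7480",  # placeholder RGW endpoint
    aws_access_key_id="ACCESS_KEY",              # placeholder credentials
    aws_secret_access_key="SECRET_KEY",
    config=Config(s3={"addressing_style": "path"}),
)
BUCKET = "upload-test"  # placeholder bucket name

open_uploads = []
for page in s3.get_paginator("list_multipart_uploads").paginate(Bucket=BUCKET):
    open_uploads.extend(page.get("Uploads", []))

print(f"{len(open_uploads)} multipart uploads still open on {BUCKET}")
for u in open_uploads[:20]:
    print(f"  {u['Key']}  upload_id={u['UploadId']}  initiated={u['Initiated']}")
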
The cluster is a 12-node Ceph Jewel (10.2.10-1~bpo80+1) cluster. The operating system is Debian 8.9.
ceph.conf
[global]
fsid = 314d4121-46b1-4433-9bae-fdd2803fc24b
mon_initial_members = ceph-1,ceph-2,ceph-3
mon_host = 10.x.x.x, 10.x.x.x, 10.x.x.x
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
public network = 10.x.x.x 
osd_journal_size = 10240 
osd_mount_options_xfs = rw,noexec,noatime,nodiratime,inode64 
osd_pool_default_size = 3 
osd_pool_default_min_size = 2 
osd_pool_default_pg_num = 900 
osd_pool_default_pgp_num = 900 
log to syslog = true 
err to syslog = true 
clog to syslog = true 
rgw dns name = xxx.com
rgw num rados handles = 8
rgw thread pool size = 512
rgw frontends = civetweb port=7480 num_threads=512
debug rgw = 20/5
debug civetweb = 20/5

[mon]
mon cluster log to syslog = true


Any idea what the issue could be here? 

Thanks
Mel

