On Mon, Oct 9, 2017 at 2:26 PM, Ugis <ugis22@xxxxxxxxx> wrote:
> We have used "rados put" in some scripts to store files as objects in
> rados, as that is very convenient for scripting. It seems that this
> tool is not intended for that, but just for rados management itself.
> Thanks for the hint, will have to look at RadosGW and CephFS for
> storing large files in rados.

Ugis, you could also try out the --striper option of "rados put",
which stripes the object into 4MB chunks. See
http://docs.ceph.com/docs/master/man/8/rados/?highlight=striper
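
For example, something along these lines (untested; <pool>, <objname>,
etc. are just placeholders):

    # write a large file as a striped object, split into stripe-sized
    # chunks instead of one huge rados object
    rados -p <pool> --striper put <objname> <infile>

    # the object must then also be stat'ed/read/removed via --striper,
    # since it is stored as several rados objects plus striping
    # metadata (in xattrs, if I remember correctly)
    rados -p <pool> --striper stat <objname>
    rados -p <pool> --striper get <objname> <outfile>
    rados -p <pool> --striper rm <objname>

And regarding Sage's question below about whether there really is a
15GB object in that pool: a slow but simple way to look for oversized
objects is to stat everything, e.g.

    rados -p <pool> ls | while read -r o; do
        rados -p <pool> stat "$o"
    done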
>
> Ugis
>
> 2017-10-08 22:00 GMT+03:00 Sage Weil <sage@xxxxxxxxxxxx>:
>> On Sun, 8 Oct 2017, Ugis wrote:
>>> ceph version 12.2.1 (3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous (stable)
>>>
>>> I'm in the process of migrating OSDs to bluestore. Several
>>> migrations were successful, but the last one failed. The OSD went
>>> down after some time backfilling; I found some errors in the log and
>>> tried to start it again, but the same errors appeared and the OSD
>>> went down again. After completing this bug report I started the OSD
>>> once more out of curiosity and now it works OK. In the meantime I
>>> have started another bluestore OSD for backfilling. Nevertheless,
>>> the errors are logged below.
>>>
>>> This seems more like a bug in the OSD than an HDD problem: I do not
>>> see SMART problems or any HDD-related issues in dmesg for this OSD.
>>>
>>> The log:
>>>     -3> 2017-10-08 16:56:31.861941 7f7481b4f700 -1 bluestore(/var/lib/ceph/osd/ceph-3) _txc_add_transaction error (7) Argument list too long not handled on operation 12 (op 2, counting from 0)
>>>     -2> 2017-10-08 16:56:31.861960 7f7481b4f700 -1 bluestore(/var/lib/ceph/osd/ceph-3) unexpected error code
>>>     -1> 2017-10-08 16:56:31.861962 7f7481b4f700  0 bluestore(/var/lib/ceph/osd/ceph-3) transaction dump:
>>> {
>>>     "ops": [
>>>         {
>>>             "op_num": 0,
>>>             "op_name": "remove",
>>>             "collection": "53.67_head",
>>>             "oid": "#-55:e6289795:::temp_recovering_53.67_273646'437437_280059_head:head#"
>>>         },
>>>         {
>>>             "op_num": 1,
>>>             "op_name": "touch",
>>>             "collection": "53.67_head",
>>>             "oid": "#-55:e6289795:::temp_recovering_53.67_273646'437437_280059_head:head#"
>>>         },
>>>         {
>>>             "op_num": 2,
>>>             "op_name": "truncate",
>>>             "collection": "53.67_head",
>>>             "oid": "#-55:e6289795:::temp_recovering_53.67_273646'437437_280059_head:head#",
>>>             "offset": 15560314374
>>
>> That's a 15GB object!  What is this pool used for?  Is there really
>> an object that big, or is there a bug?
>>
>> Bluestore has a hard limit of ~4GB for objects.  That is about 3
>> orders of magnitude higher than what we normally store, so it's not
>> expected to be a problem...
>>
>> sage
>>
>>
>>
>>>         },
>>>         {
>>>             "op_num": 3,
>>>             "op_name": "op_setallochint",
>>>             "collection": "53.67_head",
>>>             "oid": "#-55:e6289795:::temp_recovering_53.67_273646'437437_280059_head:head#",
>>>             "expected_object_size": "0",
>>>             "expected_write_size": "0"
>>>         },
>>>         {
>>>             "op_num": 4,
>>>             "op_name": "write",
>>>             "collection": "53.67_head",
>>>             "oid": "#-55:e6289795:::temp_recovering_53.67_273646'437437_280059_head:head#",
>>>             "length": 8388608,
>>>             "offset": 0,
>>>             "bufferlist length": 8388608
>>>         },
>>>         {
>>>             "op_num": 5,
>>>             "op_name": "setattrs",
>>>             "collection": "53.67_head",
>>>             "oid": "#-55:e6289795:::temp_recovering_53.67_273646'437437_280059_head:head#",
>>>             "attr_lens": {
>>>                 "_": 286,
>>>                 "snapset": 31
>>>             }
>>>         }
>>>     ]
>>> }
>>>
>>>      0> 2017-10-08 16:56:31.871938 7f7481b4f700 -1 /build/ceph-12.2.1/src/os/bluestore/BlueStore.cc: In function 'void BlueStore::_txc_add_transaction(BlueStore::TransContext*, ObjectStore::Transaction*)' thread 7f7481b4f700 time 2017-10-08 16:56:31.862077
>>> /build/ceph-12.2.1/src/os/bluestore/BlueStore.cc: 9299: FAILED assert(0 == "unexpected error")
>>>
>>> ceph version 12.2.1 (3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous (stable)
>>>  1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x102) [0x55fc757a53f2]
>>>  2: (BlueStore::_txc_add_transaction(BlueStore::TransContext*, ObjectStore::Transaction*)+0x1644) [0x55fc7565f404]
>>>  3: (BlueStore::queue_transactions(ObjectStore::Sequencer*, std::vector<ObjectStore::Transaction, std::allocator<ObjectStore::Transaction> >&, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x52e) [0x55fc75660a7e]
>>>  4: (ObjectStore::queue_transaction(ObjectStore::Sequencer*, ObjectStore::Transaction&&, Context*, Context*, Context*, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x17c) [0x55fc7520dd8c]
>>>  5: (PrimaryLogPG::queue_transaction(ObjectStore::Transaction&&, boost::intrusive_ptr<OpRequest>)+0x58) [0x55fc75399598]
>>>  6: (ReplicatedBackend::_do_push(boost::intrusive_ptr<OpRequest>)+0x391) [0x55fc754a8a41]
>>>  7: (ReplicatedBackend::_handle_message(boost::intrusive_ptr<OpRequest>)+0x2c1) [0x55fc754b7a21]
>>>  8: (PGBackend::handle_message(boost::intrusive_ptr<OpRequest>)+0x50) [0x55fc753c7aa0]
>>>  9: (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&, ThreadPool::TPHandle&)+0x55d) [0x55fc7532c62d]
>>> 10: (OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x3a9) [0x55fc751af1a9]
>>> 11: (PGQueueable::RunVis::operator()(boost::intrusive_ptr<OpRequest> const&)+0x57) [0x55fc75448ae7]
>>> 12: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x130e) [0x55fc751d67de]
>>> 13: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x884) [0x55fc757aa1e4]
>>> 14: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x55fc757ad220]
>>> 15: (()+0x76ba) [0x7f749b6d36ba]
>>> 16: (clone()+0x6d) [0x7f749a74a82d]
>>> NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
>>>
>>> --- logging levels ---
>>>    0/ 5 none
>>>    0/ 1 lockdep
>>>    0/ 1 context
>>>    1/ 1 crush
>>>    1/ 5 mds
>>>    1/ 5 mds_balancer
>>>    1/ 5 mds_locker
>>>    1/ 5 mds_log
>>>    1/ 5 mds_log_expire
>>>    1/ 5 mds_migrator
>>>    0/ 1 buffer
>>>    0/ 1 timer
>>>    0/ 1 filer
>>>    0/ 1 striper
>>>    0/ 1 objecter
>>>    0/ 5 rados
>>>    0/ 5 rbd
>>>    0/ 5 rbd_mirror
>>>    0/ 5 rbd_replay
>>>    0/ 5 journaler
>>>    0/ 5 objectcacher
>>>    0/ 5 client
>>>    1/ 5 osd
>>>    0/ 5 optracker
>>>    0/ 5 objclass
>>>    1/ 3 filestore
>>>    1/ 3 journal
>>>    0/ 5 ms
>>>    1/ 5 mon
>>>    0/10 monc
>>>    1/ 5 paxos
>>>    0/ 5 tp
>>>    1/ 5 auth
>>>    1/ 5 crypto
>>>    1/ 1 finisher
>>>    1/ 5 heartbeatmap
>>>    1/ 5 perfcounter
>>>    1/ 5 rgw
>>>    1/10 civetweb
>>>    1/ 5 javaclient
>>>    1/ 5 asok
>>>    1/ 1 throttle
>>>    0/ 0 refs
>>>    1/ 5 xio
>>>    1/ 5 compressor
>>>    1/ 5 bluestore
>>>    1/ 5 bluefs
>>>    1/ 3 bdev
>>>    1/ 5 kstore
>>>    4/ 5 rocksdb
>>>    4/ 5 leveldb
>>>    4/ 5 memdb
>>>    1/ 5 kinetic
>>>    1/ 5 fuse
>>>    1/ 5 mgr
>>>    1/ 5 mgrc
>>>    1/ 5 dpdk
>>>    1/ 5 eventtrace
>>>   -2/-2 (syslog threshold)
>>>   -1/-1 (stderr threshold)
>>>   max_recent 10000
>>>   max_new 1000
>>>   log_file /var/log/ceph/ceph-osd.3.log
>>> --- end dump of recent events ---
>>> 2017-10-08 16:56:31.909899 7f7481b4f700 -1 *** Caught signal (Aborted) **
>>> in thread 7f7481b4f700 thread_name:tp_osd_tp
>>>
>>> ceph version 12.2.1 (3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous (stable)
>>>  1: (()+0xa5a634) [0x55fc75762634]
>>>  2: (()+0x11390) [0x7f749b6dd390]
>>>  3: (gsignal()+0x38) [0x7f749a679428]
>>>  4: (abort()+0x16a) [0x7f749a67b02a]
>>>  5: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x28e) [0x55fc757a557e]
>>>  6: (BlueStore::_txc_add_transaction(BlueStore::TransContext*, ObjectStore::Transaction*)+0x1644) [0x55fc7565f404]
>>>  7: (BlueStore::queue_transactions(ObjectStore::Sequencer*, std::vector<ObjectStore::Transaction, std::allocator<ObjectStore::Transaction> >&, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x52e) [0x55fc75660a7e]
>>>  8: (ObjectStore::queue_transaction(ObjectStore::Sequencer*, ObjectStore::Transaction&&, Context*, Context*, Context*, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x17c) [0x55fc7520dd8c]
>>>  9: (PrimaryLogPG::queue_transaction(ObjectStore::Transaction&&, boost::intrusive_ptr<OpRequest>)+0x58) [0x55fc75399598]
>>> 10: (ReplicatedBackend::_do_push(boost::intrusive_ptr<OpRequest>)+0x391) [0x55fc754a8a41]
>>> 11: (ReplicatedBackend::_handle_message(boost::intrusive_ptr<OpRequest>)+0x2c1) [0x55fc754b7a21]
>>> 12: (PGBackend::handle_message(boost::intrusive_ptr<OpRequest>)+0x50) [0x55fc753c7aa0]
>>> 13: (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&, ThreadPool::TPHandle&)+0x55d) [0x55fc7532c62d]
>>> 14: (OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x3a9) [0x55fc751af1a9]
>>> 15: (PGQueueable::RunVis::operator()(boost::intrusive_ptr<OpRequest> const&)+0x57) [0x55fc75448ae7]
>>> 16: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x130e) [0x55fc751d67de]
>>> 17: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x884) [0x55fc757aa1e4]
>>> 18: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x55fc757ad220]
>>> 19: (()+0x76ba) [0x7f749b6d36ba]
>>> 20: (clone()+0x6d) [0x7f749a74a82d]
>>> NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
>>>
>>> --- begin dump of recent events ---
>>>      0> 2017-10-08 16:56:31.909899 7f7481b4f700 -1 *** Caught signal (Aborted) **
>>> in thread 7f7481b4f700 thread_name:tp_osd_tp
>>>
>>> ceph version 12.2.1 (3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous (stable)
>>>  1: (()+0xa5a634) [0x55fc75762634]
>>>  2: (()+0x11390) [0x7f749b6dd390]
>>>  3: (gsignal()+0x38) [0x7f749a679428]
>>>  4: (abort()+0x16a) [0x7f749a67b02a]
>>>  5: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x28e) [0x55fc757a557e]
>>>  6: (BlueStore::_txc_add_transaction(BlueStore::TransContext*, ObjectStore::Transaction*)+0x1644) [0x55fc7565f404]
>>>  7: (BlueStore::queue_transactions(ObjectStore::Sequencer*, std::vector<ObjectStore::Transaction, std::allocator<ObjectStore::Transaction> >&, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x52e) [0x55fc75660a7e]
>>>  8: (ObjectStore::queue_transaction(ObjectStore::Sequencer*, ObjectStore::Transaction&&, Context*, Context*, Context*, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x17c) [0x55fc7520dd8c]
>>>  9: (PrimaryLogPG::queue_transaction(ObjectStore::Transaction&&, boost::intrusive_ptr<OpRequest>)+0x58) [0x55fc75399598]
>>> 10: (ReplicatedBackend::_do_push(boost::intrusive_ptr<OpRequest>)+0x391) [0x55fc754a8a41]
>>> 11: (ReplicatedBackend::_handle_message(boost::intrusive_ptr<OpRequest>)+0x2c1) [0x55fc754b7a21]
>>> 12: (PGBackend::handle_message(boost::intrusive_ptr<OpRequest>)+0x50) [0x55fc753c7aa0]
>>> 13: (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&, ThreadPool::TPHandle&)+0x55d) [0x55fc7532c62d]
>>> 14: (OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x3a9) [0x55fc751af1a9]
>>> 15: (PGQueueable::RunVis::operator()(boost::intrusive_ptr<OpRequest> const&)+0x57) [0x55fc75448ae7]
>>> 16: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x130e) [0x55fc751d67de]
>>> 17: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x884) [0x55fc757aa1e4]
>>> 18: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x55fc757ad220]
>>> 19: (()+0x76ba) [0x7f749b6d36ba]
>>> 20: (clone()+0x6d) [0x7f749a74a82d]
>>> NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
>>>
>>> --- logging levels ---
>>>    0/ 5 none
>>>    0/ 1 lockdep
>>>    0/ 1 context
>>>    1/ 1 crush
>>>    1/ 5 mds
>>>    1/ 5 mds_balancer
>>>    1/ 5 mds_locker
>>>    1/ 5 mds_log
>>>    1/ 5 mds_log_expire
>>>    1/ 5 mds_migrator
>>>    0/ 1 buffer
>>>    0/ 1 timer
>>>    0/ 1 filer
>>>    0/ 1 striper
>>>    0/ 1 objecter
>>>    0/ 5 rados
>>>    0/ 5 rbd
>>>    0/ 5 rbd_mirror
>>>    0/ 5 rbd_replay
>>>    0/ 5 journaler
>>>    0/ 5 objectcacher
>>>    0/ 5 client
>>>    1/ 5 osd
>>>    0/ 5 optracker
>>>    0/ 5 objclass
>>>    1/ 3 filestore
>>>    1/ 3 journal
>>>    0/ 5 ms
>>>    1/ 5 mon
>>>    0/10 monc
>>>    1/ 5 paxos
>>>    0/ 5 tp
>>>    1/ 5 auth
>>>    1/ 5 crypto
>>>    1/ 1 finisher
>>>    1/ 5 heartbeatmap
>>>    1/ 5 perfcounter
>>>    1/ 5 rgw
>>>    1/10 civetweb
>>>    1/ 5 javaclient
>>>    1/ 5 asok
>>>    1/ 1 throttle
>>>    0/ 0 refs
>>>    1/ 5 xio
>>>    1/ 5 compressor
>>>    1/ 5 bluestore
>>>    1/ 5 bluefs
>>>    1/ 3 bdev
>>>    1/ 5 kstore
>>>    4/ 5 rocksdb
>>>    4/ 5 leveldb
>>>    4/ 5 memdb
>>>    1/ 5 kinetic
>>>    1/ 5 fuse
>>>    1/ 5 mgr
>>>    1/ 5 mgrc
>>>    1/ 5 dpdk
>>>    1/ 5 eventtrace
>>>   -2/-2 (syslog threshold)
>>>   -1/-1 (stderr threshold)
>>>   max_recent 10000
>>>   max_new 1000
>>>   log_file /var/log/ceph/ceph-osd.3.log
>>> --- end dump of recent events ---
>>>
>>> Best regards
>>> Ugis

--
Regards
Kefu Chai