Re: osd crashes with large object size (>10GB) in luminos Rados

Nick,

Thanks, I will look into the latest bareos version.  They did mention libradosstriper on github.

I have another question.  On jewel I have objects 25GB in size.  Once I upgrade to luminous, those objects will exceed the new size limit.
1. Will the OSDs start, and will I be able to read those objects?
2. Will they be split into smaller pieces automatically, or do I need to get them and put them back?

Thank you,
Alexander



On Tue, Sep 26, 2017 at 4:29 PM, Nick Fisk <nick@xxxxxxxxxx> wrote:

Bareos needs to be rewritten to use libradosstriper, or it should internally shard the data across multiple objects. Objects shouldn’t be that large, and performance will also suffer.
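
To illustrate what "internally shard the data across multiple objects" could look like, here is a minimal Python sketch. It is not how bareos or libradosstriper does it: the `<volume>.<index>` naming scheme, the 64M shard size, and the `put` callback are all assumptions made for the example. With the python-rados bindings, `put` could plausibly be `ioctx.write_full`, but the sketch uses a plain dict so it runs without a cluster.

```python
from typing import Callable, Iterator, Tuple

# Assumed shard size: 64M, chosen only to stay below the 128M luminous default.
CHUNK_SIZE = 64 * 1024 * 1024


def shard_layout(volume: str, total_size: int,
                 chunk_size: int = CHUNK_SIZE) -> Iterator[Tuple[str, int, int]]:
    """Yield (object_name, offset, length) for each shard of a volume.

    The "<volume>.<16-digit hex index>" naming is hypothetical.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    index = 0
    offset = 0
    while offset < total_size:
        length = min(chunk_size, total_size - offset)
        yield f"{volume}.{index:016x}", offset, length
        offset += length
        index += 1


def put_sharded(volume: str, data: bytes,
                put: Callable[[str, bytes], None],
                chunk_size: int = CHUNK_SIZE) -> int:
    """Write data as a series of small objects; return the shard count."""
    count = 0
    for name, offset, length in shard_layout(volume, len(data), chunk_size):
        put(name, data[offset:offset + length])
        count += 1
    return count


# Usage with an in-memory dict standing in for a rados pool.
store = {}
shards = put_sharded("backup-vol-0001", b"abcdefghij", store.__setitem__, chunk_size=4)
restored = b"".join(store[name] for name in sorted(store))
```

Reading a volume back means fetching the shards in index order and concatenating them, which is what the last line does against the in-memory store.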

 

From: ceph-users [mailto:ceph-users-bounces@lists.ceph.com] On Behalf Of Alexander Kushnirenko
Sent: 26 September 2017 13:50
To: ceph-users@xxxxxxxxxxxxxx
Subject: osd crashes with large object size (>10GB) in luminos Rados

 

Hello,

 

We successfully use rados to store backup volumes on the jewel version of Ceph. Typical volume size is 25-50GB.  The backup software (bareos) uses rados objects as backup volumes, and it works fine.  Recently we tried luminous for the same purpose.

 

In luminous the developers reduced osd_max_object_size from 100G to 128M, as I understand it for performance reasons.  But this broke the interaction with the bareos backup software.  You can set osd_max_object_size back to 100G, but then the OSDs start to crash once you put objects of about 4GB in size (4,294,951,051 bytes).
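
For reference, the arithmetic behind these numbers can be checked with a few lines of Python. This is only a sketch: it assumes the sizes are binary units (1G = 2**30 bytes, 1M = 2**20 bytes), and it shows that the size at which the crash was seen sits just under 2**32 bytes. Whether that proximity is the cause of the crash is not established here.

```python
# Size at which the OSD crash was observed, in bytes (from the report above).
observed = 4_294_951_051

# Assumption: binary units.
four_gib = 2 ** 32
default_limit = 128 * 2 ** 20   # 128M luminous default for osd_max_object_size

# Distance between the observed size and 2**32 bytes.
gap = four_gib - observed       # 16245 bytes


def objects_needed(volume_bytes: int, limit: int = default_limit) -> int:
    """Number of objects needed if each holds at most `limit` bytes."""
    return -(-volume_bytes // limit)   # ceiling division


small_volume = objects_needed(25 * 2 ** 30)   # 25G volume -> 200 objects
large_volume = objects_needed(50 * 2 ** 30)   # 50G volume -> 400 objects
```

So a typical 25-50GB volume would need on the order of 200 to 400 objects under the 128M default.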

 

Any suggestions on how to approach this problem?

 

Alexander.

 

Sep 26 15:12:58 ceph02 ceph-osd[1417]: /build/ceph-12.2.0/src/os/bluestore/BlueStore.cc: In function 'void BlueStore::_txc_add_transaction(BlueStore::TransContext*, ObjectStore::Transaction*)' thread 7f04ac2f9700 time 2017-09-26 15:12:58.230268

Sep 26 15:12:58 ceph02 ceph-osd[1417]: /build/ceph-12.2.0/src/os/bluestore/BlueStore.cc: 9282: FAILED assert(0 == "unexpected error")

Sep 26 15:12:58 ceph02 ceph-osd[1417]: 2017-09-26 15:12:58.229837 7f04ac2f9700 -1 bluestore(/var/lib/ceph/osd/ceph-0) _txc_add_transaction error (7) Argument list too long not handled on operation 10 (op 1, counting from 0)

Sep 26 15:12:58 ceph02 ceph-osd[1417]: 2017-09-26 15:12:58.229869 7f04ac2f9700 -1 bluestore(/var/lib/ceph/osd/ceph-0) unexpected error code

Sep 26 15:12:58 ceph02 ceph-osd[1417]:  ceph version 12.2.0 (32ce2a3ae5239ee33d6150705cdb24d43bab910c) luminous (rc)

Sep 26 15:12:58 ceph02 ceph-osd[1417]:  1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x102) [0x563c7b5f83a2]

Sep 26 15:12:58 ceph02 ceph-osd[1417]:  2: (BlueStore::_txc_add_transaction(BlueStore::TransContext*, ObjectStore::Transaction*)+0x15fa) [0x563c7b4ac2ba]

Sep 26 15:12:58 ceph02 ceph-osd[1417]:  3: (BlueStore::queue_transactions(ObjectStore::Sequencer*, std::vector<ObjectStore::Transaction, std::allocator<ObjectStore::Transaction> >&, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x536) [0x563c7b4ad916]

Sep 26 15:12:58 ceph02 ceph-osd[1417]:  4: (PrimaryLogPG::queue_transactions(std::vector<ObjectStore::Transaction, std::allocator<ObjectStore::Transaction> >&, boost::intrusive_ptr<OpRequest>)+0x66) [0x563c7b1d17f6]

Sep 26 15:12:58 ceph02 ceph-osd[1417]:  5: (ReplicatedBackend::submit_transaction(hobject_t const&, object_stat_sum_t const&, eversion_t const&, std::unique_ptr<PGTransaction, std::default_delete<PGTransaction> >&&, eversion_t const&, eversion_t const&, std::vector<pg_log_entry_t, std::allocator<pg_log_entry_t> > const&, boost::optional<pg_hit_set_history_t>&, Context*, Context*, Context*, unsigned long, osd_reqid_t, boost::intrusive_ptr<OpRequest>)+0xcbf) [0x563c7b30436f]

Sep 26 15:12:58 ceph02 ceph-osd[1417]:  6: (PrimaryLogPG::issue_repop(PrimaryLogPG::RepGather*, PrimaryLogPG::OpContext*)+0x9fa) [0x563c7b16d68a]

Sep 26 15:12:58 ceph02 ceph-osd[1417]:  7: (PrimaryLogPG::execute_ctx(PrimaryLogPG::OpContext*)+0x131d) [0x563c7b1b7a5d]

Sep 26 15:12:58 ceph02 ceph-osd[1417]:  8: (PrimaryLogPG::do_op(boost::intrusive_ptr<OpRequest>&)+0x2ece) [0x563c7b1bb26e]

Sep 26 15:12:58 ceph02 ceph-osd[1417]:  9: (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&, ThreadPool::TPHandle&)+0xea6) [0x563c7b175446]

Sep 26 15:12:58 ceph02 ceph-osd[1417]:  10: (OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x3ab) [0x563c7aff919b]

Sep 26 15:12:58 ceph02 ceph-osd[1417]:  11: (PGQueueable::RunVis::operator()(boost::intrusive_ptr<OpRequest> const&)+0x5a) [0x563c7b29154a]

Sep 26 15:12:58 ceph02 ceph-osd[1417]:  12: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x103d) [0x563c7b01fd9d]

Sep 26 15:12:58 ceph02 ceph-osd[1417]:  13: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x8ef) [0x563c7b5fd20f]

Sep 26 15:12:58 ceph02 ceph-osd[1417]:  14: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x563c7b600510]

Sep 26 15:12:58 ceph02 ceph-osd[1417]:  15: (()+0x7494) [0x7f04c56e2494]

Sep 26 15:12:58 ceph02 ceph-osd[1417]:  16: (clone()+0x3f) [0x7f04c4769aff]

 



