Somnath,

Please take a look at issue #16662; there is a full analysis there of the problem and where the bug lies. But essentially, you have already pointed it out below: the bitmap allocator has a maximum per-allocation limit of 4 MB, but the code below does not break up the 8 MB allocation into 4 MB chunks, because max_alloc_size defaults to zero if it is not specified in ceph.conf. So the bitmap allocator returns -ENOSPC and the assert fires. (There is a rough sketch of the chunking idea after the quoted traceback below.)

Cheers, Kevan

On 7/12/16, 1:13 PM, "Somnath Roy" <Somnath.Roy@xxxxxxxxxxx> wrote:

>Maybe the allocator is not able to handle a request bigger than 4 MB, Ramesh?
>
>  uint64_t want = max_alloc_size ? MIN(final_length, max_alloc_size)
>                                 : final_length;
>  int r = alloc->allocate(want, min_alloc_size, hint, &e.offset, &l);
>  assert(r == 0);
>
>Kevan,
>Could you please try setting the following in ceph.conf and reproduce?
>You will need to recreate the cluster.
>
>bluestore_allocator = stupid
>
>Thanks & Regards
>Somnath
>
>-----Original Message-----
>From: ceph-devel-owner@xxxxxxxxxxxxxxx
>[mailto:ceph-devel-owner@xxxxxxxxxxxxxxx] On Behalf Of Mark Nelson
>Sent: Tuesday, July 12, 2016 11:03 AM
>To: Kevan Rehm; ceph-devel (ceph-devel@xxxxxxxxxxxxxxx)
>Subject: Re: bluestore asserts if I/O size > 4MB when using SSDs
>
>Hi Kevan,
>
>There have been a lot of big changes in the last couple of months, so we're
>slowly working on getting things back into a stable state. :) Thanks for
>the report!
>
>Mark
>
>On 07/12/2016 12:56 PM, Kevan Rehm wrote:
>> Greetings,
>>
>> I have opened issue #16662 for a bluestore assert that happens when
>> using SSDs with I/O sizes larger than 4 MB. Below is the traceback.
>> Inquiring minds can look at the issue for an analysis of what is
>> happening. If you set 'bluestore_max_alloc_size' to 4194304 in
>> ceph.conf, then you can avoid this problem.
>>
>> Cheers, Kevan
>>
>>     0> 2016-07-12 15:37:03.377685 7f47731a7700 -1 os/bluestore/BlueStore.cc: In function 'int BlueStore::_do_alloc_write(BlueStore::TransContext*, BlueStore::WriteContext*)' thread 7f47731a7700 time 2016-07-12 15:37:03.368821
>> os/bluestore/BlueStore.cc: 5988: FAILED assert(r == 0)
>>
>> ceph version 11.0.0-196-g85bb43e (85bb43e111692989d2296a389ce45377d2297d6f)
>> 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x85) [0x7f479212de35]
>> 2: (BlueStore::_do_alloc_write(BlueStore::TransContext*, BlueStore::WriteContext*)+0x10af) [0x7f4791d48c4f]
>> 3: (BlueStore::_do_write(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>, unsigned long, unsigned long, ceph::buffer::list&, unsigned int)+0x3e9) [0x7f4791d69479]
>> 4: (BlueStore::_write(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>&, unsigned long, unsigned long, ceph::buffer::list&, unsigned int)+0x10f) [0x7f4791d6a0bf]
>> 5: (BlueStore::_txc_add_transaction(BlueStore::TransContext*, ObjectStore::Transaction*)+0x1127) [0x7f4791d6e297]
>> 6: (BlueStore::queue_transactions(ObjectStore::Sequencer*, std::vector<ObjectStore::Transaction, std::allocator<ObjectStore::Transaction> >&, std::shared_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x41a) [0x7f4791d7017a]
>> 7: (ReplicatedPG::queue_transactions(std::vector<ObjectStore::Transaction, std::allocator<ObjectStore::Transaction> >&, std::shared_ptr<OpRequest>)+0x8c) [0x7f4791bffc4c]
>> 8: (ReplicatedBackend::submit_transaction(hobject_t const&, eversion_t const&, std::unique_ptr<PGBackend::PGTransaction, std::default_delete<PGBackend::PGTransaction> >&&, eversion_t const&, eversion_t const&, std::vector<pg_log_entry_t, std::allocator<pg_log_entry_t> > const&, boost::optional<pg_hit_set_history_t>&, Context*, Context*, Context*, unsigned long, osd_reqid_t, std::shared_ptr<OpRequest>)+0x94d) [0x7f4791c3b67d]
>> 9: (ReplicatedPG::issue_repop(ReplicatedPG::RepGather*, ReplicatedPG::OpContext*)+0x6af) [0x7f4791b8ba5f]
>> 10: (ReplicatedPG::execute_ctx(ReplicatedPG::OpContext*)+0xeea) [0x7f4791bdfeda]
>> 11: (ReplicatedPG::do_op(std::shared_ptr<OpRequest>&)+0x2843) [0x7f4791be3853]
>> 12: (ReplicatedPG::do_request(std::shared_ptr<OpRequest>&, ThreadPool::TPHandle&)+0x777) [0x7f4791b9e957]
>> 13: (OSD::dequeue_op(boost::intrusive_ptr<PG>, std::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x41d) [0x7f4791a4d0ad]
>> 14: (PGQueueable::RunVis::operator()(std::shared_ptr<OpRequest> const&)+0x6d) [0x7f4791a4d2fd]
>> 15: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x864) [0x7f4791a6e884]
>> 16: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x947) [0x7f4792112147]
>> 17: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x7f47921142a0]
>> 18: (()+0x7dc5) [0x7f478ffd7dc5]
>> 19: (clone()+0x6d) [0x7f478e1efced]
>> NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
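
To make the chunking idea concrete, here is a minimal, self-contained sketch. It is not the actual BlueStore code or the eventual fix; the Allocator stand-in, the Extent struct, the ALLOC_MAX_CHUNK constant, and the alloc_write() helper are all made up for illustration, and the allocate() signature is only loosely modelled on the call quoted above. The point is simply that each allocate() call is capped at the per-allocation limit and the loop continues until final_length is covered, even when max_alloc_size is left at its default of zero:

// Minimal sketch, not BlueStore: cap each allocation request at the
// allocator's per-allocation limit and loop until the full length is covered.
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <iostream>
#include <vector>

static const uint64_t ALLOC_MAX_CHUNK = 4ULL << 20;  // assumed 4 MB per-call limit

struct Extent { uint64_t offset = 0; uint64_t length = 0; };

// Stand-in allocator that, like the bitmap allocator described above,
// refuses any single request larger than 4 MB.
struct Allocator {
  uint64_t next_offset = 0;
  int allocate(uint64_t want, uint64_t min_alloc_size, uint64_t hint,
               uint64_t *offset, uint64_t *length) {
    (void)min_alloc_size; (void)hint;
    if (want > ALLOC_MAX_CHUNK)
      return -28;  // -ENOSPC, which is what fires the assert today
    *offset = next_offset;
    *length = want;
    next_offset += want;
    return 0;
  }
};

// Break final_length into pieces no larger than ALLOC_MAX_CHUNK.  A
// max_alloc_size of zero (the ceph.conf default) no longer means "ask
// for everything in a single call".
std::vector<Extent> alloc_write(Allocator &alloc, uint64_t final_length,
                                uint64_t max_alloc_size,
                                uint64_t min_alloc_size) {
  uint64_t cap = max_alloc_size ? std::min(max_alloc_size, ALLOC_MAX_CHUNK)
                                : ALLOC_MAX_CHUNK;
  std::vector<Extent> extents;
  uint64_t remaining = final_length;
  while (remaining > 0) {
    uint64_t want = std::min(remaining, cap);
    Extent e;
    int r = alloc.allocate(want, min_alloc_size, 0, &e.offset, &e.length);
    assert(r == 0);  // each piece now fits within the allocator's limit
    remaining -= e.length;
    extents.push_back(e);
  }
  return extents;
}

int main() {
  Allocator alloc;
  // 8 MB write, max_alloc_size left at its default of 0:
  std::vector<Extent> extents = alloc_write(alloc, 8ULL << 20, 0, 64 * 1024);
  std::cout << "allocated " << extents.size() << " extents" << std::endl;  // prints 2
  return 0;
}

Built and run as-is, the 8 MB request with max_alloc_size = 0 comes back as two 4 MB extents, instead of a single oversized request that the stand-in allocator (like the real bitmap allocator) rejects with -ENOSPC.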