RE: Bluestore OSD support in ceph-disk

Please send the snippet (the very first trace; go up in the log) where it is actually printing the assert.
BTW, what workload are you running?

Thanks & Regards
Somnath

-----Original Message-----
From: Kamble, Nitin A [mailto:Nitin.Kamble@xxxxxxxxxxxx]
Sent: Friday, September 16, 2016 11:38 AM
To: Sage Weil
Cc: Somnath Roy; Ceph Development
Subject: Re: Bluestore OSD support in ceph-disk


> On Sep 15, 2016, at 11:43 PM, Kamble, Nitin A <Nitin.Kamble@xxxxxxxxxxxx> wrote:
>
>>
>> On Sep 15, 2016, at 11:54 AM, Sage Weil <sage@xxxxxxxxxxxx> wrote:
>>
>>
>> The 128MB figure is mostly pulled out of a hat.  I suspect it will be
>> reasonable, but a proper recommendation is going to depend on how we
>> end up tuning rocksdb, and we've put that off until the metadata
>> format is finalized and any rocksdb tuning we do will be meaningful.
>> We're pretty much at that point now...
>>
>> Whatever it is, it should be related to the request rate, and perhaps
>> the relative speed of the wal device and the db or main device.  The
>> size of the slower devices shouldn't matter, though.
>>
>> There are some bluefs perf counters that let you monitor what the wal
>> device utilization is.  See
>>
>> b.add_u64(l_bluefs_wal_total_bytes, "wal_total_bytes",
>>     "Total bytes (wal device)");
>> b.add_u64(l_bluefs_wal_free_bytes, "wal_free_bytes",
>>     "Free bytes (wal device)");
>>
>> which you can monitor via 'ceph daemon osd.N perf dump'.  If you
>> discover anything interesting, let us know!
>>
>> Thanks-
>> sage
>
> I was able to build and deploy the latest master (commit: 9096ad37f2c0798c26d7784fb4e7a781feb72cb8) with partitioned bluestore. I struggled a bit to bring up the OSDs, as the available documentation for bringing up partitioned bluestore OSDs is still fairly sparse. Once ceph-disk gets updated this pain will go away. We will stress the cluster shortly, but so far I am delighted to see that, from ground zero, it is able to stand up on its own feet to HEALTH_OK without any errors. If I see any issues in our tests I will share them here.
>
> Thanks,
> Nitin
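
(An aside on the wal counters Sage mentions above: a minimal, untested sketch of polling them through the admin socket could look like the following. It assumes the counters show up under a "bluefs" section of the 'ceph daemon osd.N perf dump' JSON; that section name and the polling loop are assumptions, not something verified against this build.)

#!/usr/bin/env python
# Rough sketch: poll wal_total_bytes / wal_free_bytes for one OSD via
# 'ceph daemon osd.N perf dump' and print utilization. Run it on the node
# hosting the OSD (it needs the admin socket). The "bluefs" section name
# is an assumption; adjust if the counters are nested differently.
import json
import subprocess
import sys
import time

def wal_utilization(osd_id):
    out = subprocess.check_output(
        ["ceph", "daemon", "osd.%d" % osd_id, "perf", "dump"])
    bluefs = json.loads(out.decode("utf-8")).get("bluefs", {})
    total = bluefs.get("wal_total_bytes", 0)
    free = bluefs.get("wal_free_bytes", 0)
    return total, free, total - free

if __name__ == "__main__":
    osd = int(sys.argv[1]) if len(sys.argv) > 1 else 0
    while True:
        total, free, used = wal_utilization(osd)
        pct = 100.0 * used / total if total else 0.0
        print("osd.%d wal: total=%d free=%d used=%d (%.1f%%)"
              % (osd, total, free, used, pct))
        time.sleep(5)
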

Out of 30 OSDs, one failed after 1.5 hours of stress. The remaining 29 OSDs have been holding up fine for many hours.
If needed, I can provide the executable or objdump.


 ceph version v11.0.0-2309-g9096ad3 (9096ad37f2c0798c26d7784fb4e7a781feb72cb8)
 1: (()+0x892dd2) [0x7fb5bf2b8dd2]
 2: (()+0xf890) [0x7fb5bb4ae890]
 3: (gsignal()+0x37) [0x7fb5ba270187]
 4: (abort()+0x118) [0x7fb5ba271538]
 5: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x265) [0x7fb5bf43a2f5]
 6: (BlueFS::_allocate(unsigned char, unsigned long, std::vector<bluefs_extent_t, std::allocator<bluefs_extent_t> >*)+0x8ad) [0x7fb5bf2735dd]
 7: (BlueFS::_flush_and_sync_log(std::unique_lock<std::mutex>&, unsigned long, unsigned long)+0xb4f) [0x7fb5bf27aa1f]
 8: (BlueFS::_fsync(BlueFS::FileWriter*, std::unique_lock<std::mutex>&)+0x29b) [0x7fb5bf27bc9b]
 9: (BlueRocksWritableFile::Sync()+0x4e) [0x7fb5bf29125e]
 10: (rocksdb::WritableFileWriter::SyncInternal(bool)+0x139) [0x7fb5bf388699]
 11: (rocksdb::WritableFileWriter::Sync(bool)+0x88) [0x7fb5bf389238]
 12: (rocksdb::DBImpl::WriteImpl(rocksdb::WriteOptions const&, rocksdb::WriteBatch*, rocksdb::WriteCallback*, unsigned long*, unsigned long, bool)+0x13cf) [0x7fb5bf2e0a2f]
 13: (rocksdb::DBImpl::Write(rocksdb::WriteOptions const&, rocksdb::WriteBatch*)+0x27) [0x7fb5bf2e1637]
 14: (RocksDBStore::submit_transaction_sync(std::shared_ptr<KeyValueDB::TransactionImpl>)+0x5b) [0x7fb5bf21a14b]
 15: (BlueStore::_kv_sync_thread()+0xf5a) [0x7fb5bf1e7ffa]
 16: (BlueStore::KVSyncThread::entry()+0xd) [0x7fb5bf1f5a6d]
 17: (()+0x80a4) [0x7fb5bb4a70a4]
 18: (clone()+0x6d) [0x7fb5ba32004d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

Thanks,
Nitin

