Re: Bluestore crashes

From: Thomas Swindells <mole125@xxxxxxxxx>
To: Mark Nelson <mnelson@xxxxxxxxxx>; "ceph-users@xxxxxxxxxxxxxx" <ceph-users@xxxxxxxxxxxxxx>
Sent: Thursday, 8 September 2016, 15:39
Subject: Re: Bluestore crashes

You are right, we are running ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374) at the moment. If master has had substantial work done on it, then it sounds like it is worth us retesting on that.

I fully appreciate that it is still a work in progress and that future performance may go down as well as up. That said, other than this one issue we have been impressed with the stability and have not come across other significant bugs.

Thanks,

Thomas



From: Mark Nelson <mnelson@xxxxxxxxxx>
To: ceph-users@xxxxxxxxxxxxxx
Sent: Thursday, 8 September 2016, 15:15
Subject: Re: [ceph-users] Bluestore crashes

It's important to keep in mind that bluestore is still rapidly being
developed.  At any given commit it might crash, eat data, be horribly
slow, destroy your computer, etc.  It's very much wild west territory.
Jewel's version of bluestore is quite different than what is in master
right now.  Please take any benchmarks with a big grain of salt since
there are several known issues we are working through (especially around
encode/decode).  Having said that, I'm glad to hear it's faster! :)

Mark

On 09/08/2016 08:19 AM, Wido den Hollander wrote:
>
>> On 8 September 2016 at 14:58, thomas.swindells@xxxxxxxxx wrote:
>>
>>
>> We've been doing some performance testing on Bluestore to see whether it could be viable to use in the future.
>> The good news is that we are seeing significant performance improvements when using it, so thank you for all the work that has gone into it.
>> The bad news is that we keep encountering crashes and corruption that require a rebuild. An example log extract follows:
>> 2016-09-03 17:56:57.337756 7f593fe9b700 -1 freelist release bad release 564521787392~4096 overlaps with 564521787392~4096
>> 2016-09-03 17:56:57.340169 7f593fe9b700 -1 os/bluestore/FreelistManager.cc: In function 'int FreelistManager::release(uint64_t, uint64_t, KeyValueDB::Transaction)' thread 7f593fe9b700 time 2016-09-03 17:56:57.338393
>> os/bluestore/FreelistManager.cc: 245: FAILED assert(0 == "bad release overlap")
>>  ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374)
>>  1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x85) [0x7f59643945b5]
>>  2: (FreelistManager::release(unsigned long, unsigned long, std::shared_ptr<KeyValueDB::TransactionImpl>)+0x533) [0x7f596402c963]
>>  3: (BlueStore::_txc_update_fm(BlueStore::TransContext*)+0x317) [0x7f5963fda0d7]
>>  4: (BlueStore::_kv_sync_thread()+0x9b0) [0x7f5964003690]
>>  5: (BlueStore::KVSyncThread::entry()+0xd) [0x7f59640299dd]
>>  6: (()+0x7dc5) [0x7f59622c4dc5]
>>  7: (clone()+0x6d) [0x7f596095021d]
>>  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
>> --- begin dump of recent events ---
>> -10000> 2016-09-03 17:43:23.940910 7f593fe9b700  5 rocksdb: EmitPhysicalRecord: log 37 offset 70561038 len 23 crc 696327790
>>  -9999> 2016-09-03 17:43:23.965781 7f593fe9b700  5 rocksdb: EmitPhysicalRecord: log 37 offset 70561061 len 2178 crc 2109984297
>>  -9998> 2016-09-03 17:43:23.965833 7f593fe9b700  5 rocksdb: EmitPhysicalRecord: log 37 offset 70563239 len 2175 crc 493419836
>>  -9997> 2016-09-03 17:43:23.965867 7f593fe9b700  5 rocksdb: EmitPhysicalRecord: log 37 offset 70565414 len 23 crc 1766806723...
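For readers unfamiliar with the message, the first log line appears to report extents in `offset~length` form: the extent being released (offset 564521787392, length 4096) overlaps an extent the freelist already holds, and here the two are identical. The following is only an illustrative Python sketch of that overlap test, not the actual FreelistManager code; the `Extent` type and `parse_bad_release` helper are hypothetical names introduced for this example.

```python
import re
from typing import NamedTuple, Optional, Tuple


class Extent(NamedTuple):
    """A byte range written as offset~length in the log (hypothetical helper type)."""
    offset: int
    length: int

    @property
    def end(self) -> int:
        # Exclusive end of the range.
        return self.offset + self.length

    def overlaps(self, other: "Extent") -> bool:
        # Two half-open ranges overlap when each starts before the other ends.
        return self.offset < other.end and other.offset < self.end


# Pattern for the "bad release A~B overlaps with C~D" message shown above.
_BAD_RELEASE = re.compile(r"bad release (\d+)~(\d+) overlaps with (\d+)~(\d+)")


def parse_bad_release(line: str) -> Optional[Tuple[Extent, Extent]]:
    """Return (released, existing) extents from a log line, or None if absent."""
    m = _BAD_RELEASE.search(line)
    if m is None:
        return None
    a, b, c, d = map(int, m.groups())
    return Extent(a, b), Extent(c, d)


line = ("2016-09-03 17:56:57.337756 7f593fe9b700 -1 freelist release "
        "bad release 564521787392~4096 overlaps with 564521787392~4096")
released, existing = parse_bad_release(line)
print(released.overlaps(existing), released == existing)  # True True
```

In this extract the released and existing extents are the same 4096-byte range, which would be consistent with the same block being released twice; that reading is an inference from the log line, not something confirmed in the thread.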
>>
>
> Seems you are running Jewel (10.2.2) and the BlueStore version in there is not the best.
>
> If you want to test with BlueStore I recommend that you run with code from the master branch.
>
> Wido
>
>> This appears to match this existing bug: http://tracker.ceph.com/issues/15659
>> Are there any known workarounds to prevent the issue from happening? What information and support can we provide to help progress a fix? We seem to be able to reliably reproduce the issue after about 6 hours or so of test running, so we would be able to test any proposed fixes if that would be helpful.
>> Thanks,
>> Thomas
>> _______________________________________________
>> ceph-users mailing list
>> ceph-users@xxxxxxxxxxxxxx
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
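Since the original report says the crash reproduces only after about six hours of testing, a small scanner for assertion failures can help when checking long logs from a retest. This is a minimal Python sketch that assumes the assert line has the same shape as the one quoted in this thread (`<file>: <line>: FAILED assert(<expr>)`); the function name `find_failed_asserts` is hypothetical and other Ceph versions may format the line differently.

```python
import re
from typing import Iterable, Iterator, Tuple

# Matches lines shaped like:
#   os/bluestore/FreelistManager.cc: 245: FAILED assert(0 == "bad release overlap")
_ASSERT = re.compile(
    r"(?P<file>\S+):\s*(?P<line>\d+): FAILED assert\((?P<expr>.*)\)"
)


def find_failed_asserts(lines: Iterable[str]) -> Iterator[Tuple[str, int, str]]:
    """Yield (source file, line number, asserted expression) for each failure."""
    for text in lines:
        m = _ASSERT.search(text)
        if m:
            yield m.group("file"), int(m.group("line")), m.group("expr")


sample = [
    "2016-09-03 17:43:23.940910 7f593fe9b700  5 rocksdb: EmitPhysicalRecord: "
    "log 37 offset 70561038 len 23 crc 696327790",
    'os/bluestore/FreelistManager.cc: 245: FAILED assert(0 == "bad release overlap")',
]
for hit in find_failed_asserts(sample):
    print(hit)
```

Any iterable of lines works as input, including an open log file; where the OSD log lives depends on the deployment, so no path is assumed here.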
