On 09/09/2018 10:08, Xiangyang Yu wrote:
Since the default filestore omap db in Jewel 10.2.11 is rocksdb,
compiling with jemalloc is not encouraged.
Right?
If so: Jewel still defaults to the simple messenger, where jemalloc
performs better. So in Jewel 10.2.11
it's better to use leveldb for both the monitor and the omap db.
Right?
Well, bad performance is a concern of course.
Crashing programs is a way bigger problem.
But I'll keep it in the back of my head.
--WjW
On Sat, 8 Sep 2018 at 01:02, Sage Weil <sage@xxxxxxxxxxxx> wrote:
On Fri, 7 Sep 2018, Willem Jan Withagen wrote:
On 07/09/2018 14:28, Sage Weil wrote:
On Fri, 7 Sep 2018, Xiangyang Yu wrote:
Hi all,
In our production cluster, we use jewel 10.2.10. We use jemalloc to
allocate memory.
These days we are trying to add rocksdb support to osd and monitor,
Be aware that there is a known problem with rocksdb and jemalloc that
causes a crash; see http://tracker.ceph.com/issues/20557
That appears to be a different issue than the compilation problem you are
seeing. Assuming you get past that, though, I would expect you to hit the
#20557 bug anyway.
Starting with luminous we've recommended users stop using jemalloc because
the switch to AsyncMessenger wipes away the benefit users were seeing in
jewel; tcmalloc and jemalloc now perform about the same (when jemalloc
isn't crashing :).
'mmmm,
Made me start to wonder why the FreeBSD version had not been bitten by this?
Since jemalloc is the default malloc there.
But everything is linked against libtcmalloc.so, which could/should prevent
that.
Is this codepath typical for Bluestore, and not for filestore??
Since that would be the other explanation.
Right.. this only seems to happen with rocksdb, and thus with luminous.
(In luminous we also switched the mons to default to rocksdb.)
But then again, I submitted some fixes to rocksdb so that it includes and
links correctly when jemalloc is part of libc. So it could be that
malloc in rocksdb then started using the libc malloc (i.e. the jemalloc
variant).
Do we know what the bug in the rocksdb/jemalloc combination is? Or was it
solved by going to tcmalloc, without an in-depth understanding of what was going on?
Right.. I've no idea what the actual problem is. Disabling jemalloc fixes
it (most users who hit this had an /etc/{default,sysconfig}/ceph entry
preloading jemalloc) and jemalloc doesn't offer the same performance boost
that it did on jewel, so we didn't investigate.
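To check whether a host still has such a preload entry, something like the following could be used. This is only a sketch: the two file paths are the ones named above, but the assumption that the preload is written as an `LD_PRELOAD=` assignment, and the helper names `jemalloc_preloads` and `scan`, are illustrative and not taken from Ceph itself.

```python
import re
from pathlib import Path

# Paths named in the message above. That the entry is an LD_PRELOAD
# assignment is an assumption about how the preload is configured.
CANDIDATE_FILES = ("/etc/default/ceph", "/etc/sysconfig/ceph")

_PRELOAD = re.compile(r"^\s*(?:export\s+)?LD_PRELOAD\s*=\s*(.+)$")


def jemalloc_preloads(text):
    """Return the active LD_PRELOAD values in `text` that mention jemalloc.

    Commented-out lines are skipped because they have no effect.
    """
    hits = []
    for line in text.splitlines():
        if line.lstrip().startswith("#"):
            continue
        match = _PRELOAD.match(line)
        if match and "jemalloc" in match.group(1):
            hits.append(match.group(1).strip().strip("\"'"))
    return hits


def scan(paths=CANDIDATE_FILES):
    """Map each existing candidate file to its jemalloc preload entries."""
    found = {}
    for name in paths:
        path = Path(name)
        if path.is_file():
            hits = jemalloc_preloads(path.read_text())
            if hits:
                found[name] = hits
    return found


if __name__ == "__main__":
    for name, hits in scan().items():
        for value in hits:
            print(f"{name}: LD_PRELOAD={value}")
```

If it prints nothing, no active jemalloc preload was found in either file; it does not detect a binary that was linked against jemalloc at build time.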
s
Putting it on the list to figure out.
--WjW
sage
I have merged the commit below but failed to compile the code,
https://github.com/ceph/ceph/pull/18010
The build output shows:
src/rocksdb/db/db_impl.cc:401 undefined reference to
'malloc_stats_print'
Make[3] : [ceph_test_keyvaluedb] ERROR 1
src/rocksdb/db/db_impl.cc:401 undefined reference to
'malloc_stats_print'
Make[3] : [ceph_osdmap_tool] ERROR 1
src/rocksdb/db/db_impl.cc:401 undefined reference to
'malloc_stats_print'
Make[3] : [ceph_kvstore_tool] ERROR 1
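To make it easier to see that every failing target trips over the same symbol, the errors can be grouped by symbol. The sketch below works on the text as quoted above (which is retyped, so it is not exact ld/make output); the function name `failing_targets` and the regular expressions are illustrative assumptions.

```python
import re
from collections import defaultdict

# The errors as quoted in the message (retyped by hand, not raw tool output).
LOG = """\
src/rocksdb/db/db_impl.cc:401 undefined reference to
'malloc_stats_print'
Make[3] : [ceph_test_keyvaluedb] ERROR 1
src/rocksdb/db/db_impl.cc:401 undefined reference to
'malloc_stats_print'
Make[3] : [ceph_osdmap_tool] ERROR 1
src/rocksdb/db/db_impl.cc:401 undefined reference to
'malloc_stats_print'
Make[3] : [ceph_kvstore_tool] ERROR 1
"""

_SYMBOL = re.compile(r"undefined reference to\s+[`']([^']+)'")
_TARGET = re.compile(
    r"make\[\d+\]\s*:\s*(?:\*\*\*\s*)?\[([^\]]+)\]\s*error", re.IGNORECASE
)


def failing_targets(log):
    """Group failing make targets by the undefined symbols reported before them."""
    joined = " ".join(log.split())  # undo the line wrapping
    events = [(m.start(), "sym", m.group(1)) for m in _SYMBOL.finditer(joined)]
    events += [(m.start(), "tgt", m.group(1)) for m in _TARGET.finditer(joined)]
    by_symbol = defaultdict(list)
    pending = []  # symbols seen since the last failing target
    for _, kind, value in sorted(events):
        if kind == "sym":
            pending.append(value)
        else:
            for symbol in dict.fromkeys(pending):  # de-duplicate, keep order
                by_symbol[symbol].append(value)
            pending = []
    return dict(by_symbol)


if __name__ == "__main__":
    for symbol, targets in failing_targets(LOG).items():
        print(symbol, "->", ", ".join(targets))
```

All three targets fail on `malloc_stats_print`, a jemalloc function, which suggests those executables are being linked without the allocator library.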
Then I found a related commit that was merged in Luminous:
cmake: should link against ${ALLOC_LIBS}
https://github.com/ceph/ceph/pull/11978/files
But this commit did not resolve my problem; the errors still exist.
But when I compile 10.2.11, no errors show, which is very surprising. All
the makefiles seem the same.
I have spent one day trying to solve the problem, with no result.
I must be missing some commits. Does anyone have any clues?
Best wishes,
brandy