Several years ago Mark Kampe proposed doing something like this. I was
never totally convinced we could make something accurate enough quickly
enough for it to be useful.
If I were to attempt it, I would probably start out with a multiple
regression approach based on seemingly important configuration
parameters and results. I suspect that with enough tweaking and an
admittedly large sample set (LOTS of ceph-brag runs), a model could be
built that would be moderately instructive for spinning disks. So long
as a reasonable per-OSD CPU/memory ratio is maintained, spinning disk
performance is low enough and mostly static enough that minor code
changes and new drive models probably won't ruin the model.
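Just to make the idea concrete, here's a minimal sketch of the kind of
regression I mean, in Python with scikit-learn. The feature names and
numbers are entirely made up (they aren't from a real ceph-brag schema
or real runs); this shows the shape of the approach, not a working model:

import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical samples, one row per ceph-brag-style run.
# Columns: osd_count, per_osd_cpu_ghz, per_osd_ram_gb, replica_count
# (invented features and numbers, purely illustrative).
X = np.array([
    [12, 2.0, 4, 3],
    [24, 2.0, 4, 3],
    [24, 2.6, 8, 2],
    [48, 2.6, 8, 3],
])
# "Measured" 4MB sequential write throughput in MB/s (made up).
y = np.array([900.0, 1700.0, 2600.0, 3400.0])

model = LinearRegression().fit(X, y)
print("coefficients:", model.coef_)
print("intercept:", model.intercept_)

# Predict an unseen configuration: 36 OSDs, 2.4GHz, 8GB RAM/OSD, 3x rep.
print("predicted MB/s:", model.predict([[36, 2.4, 8, 3]])[0])

In practice you'd need far more samples than features, plus
cross-validation, before the coefficients mean anything, which is
exactly where the ceph-brag sample volume problem comes in.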
For SSD/NVMe there are just too many factors that can have a huge impact
on latency and performance. Look, for example, at last year's
tcmalloc/jemalloc thread cache results and how huge a performance impact
that alone can have. You might think: "ok, that's one parameter that
has a huge impact on performance (specifically small random write
performance), the model should be able to take that into account pretty
easily". The problem is that it's CPU dependent. Setups with few OSDs
and lots of CPU might not show any obvious performance advantage with
higher TCMalloc thread cache settings (but way lower CPU usage!) while
CPU limited setups might show huge performance advantages if they have
fast enough disks to make use of it. Not all versions of tcmalloc will
show the advantage. Older ones are buggy and don't improve performance
at all when you increase the threadcache. It also appears to be
primarily impactful with simplemessenger. When asyncmesseneger is used,
TCMalloc threadcache has almost no impact, but there are other
bottlenecks that impact asyncmessenger that don't impact
simplemessenger. Think of how many samples you'd need to even
semi-accurately predict the impact of tcmalloc threadcache. Now try to
do that when CPUs, SSDs, Network, Kernels, Ceph configurations, SSDs
with poor O_DSYNC performance, etc are all in play. By the time you
collected enough samples to say anything meaningful, the Ceph code and
the available hardware will have changed so much that your old samples
will be useless.
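To put a very rough number on that last point, here's a
back-of-the-envelope sketch in Python; the factor counts are invented
for illustration and are probably on the low side:

# How many distinct hardware/software combinations would a model need
# to see before it could say anything about the thread cache knob?
# The factor counts below are invented for illustration.
factors = {
    "cpu_models": 10,
    "ssd_models": 15,
    "network_configs": 4,
    "kernels": 5,
    "tcmalloc_versions": 3,
    "messenger_types": 2,        # simplemessenger vs asyncmessenger
    "threadcache_settings": 4,
    "ceph_config_profiles": 10,
}

combinations = 1
for count in factors.values():
    combinations *= count

samples_per_combo = 5  # bare minimum to average out run-to-run noise
print("distinct combinations:", combinations)                       # 720,000
print("benchmark runs needed:", combinations * samples_per_combo)   # 3,600,000

And that's before SSDs with poor O_DSYNC behavior or new Ceph releases
even enter the picture.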
I think a better approach than trying to model all of this is to simply
spend our resources optimizing away as many of the variables as
possible. That's part of the idea behind bluestore. The more of the
stack we can remove and/or control, the more variables we can outright
eliminate, and hopefully the model becomes simpler and more
straightforward. Even then, things like RocksDB compaction and
read/write amplification still have a huge effect. This is why there
have been so
many discussions lately on ceph-devel around rocksdb settings, bluestore
onode size reduction, encode/decode changes, etc.
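As a heavily simplified illustration of why the RocksDB side matters so
much, here's a back-of-the-envelope write amplification estimate in
Python. It leans on the commonly cited rule of thumb that leveled
compaction rewrites each byte on the order of the level fanout for every
level it passes through; the level count, fanout, and throughput numbers
are made up, not bluestore measurements:

# Back-of-the-envelope write amplification for leveled compaction.
# Rule of thumb (illustrative only): each byte is rewritten roughly
# `fanout` times per level it trickles down through, on top of the
# initial WAL write and the memtable flush to L0.
def leveled_write_amp(levels, fanout):
    wal_and_l0 = 2                 # WAL write + memtable flush to L0
    compaction = levels * fanout   # rewrites during compaction
    return wal_and_l0 + compaction

wa = leveled_write_amp(levels=4, fanout=10)
client_mb_s = 200  # hypothetical metadata-heavy client write rate
print("write amplification: ~%dx" % wa)                 # ~42x
print("device writes: ~%d MB/s" % (client_mb_s * wa))   # ~8400 MB/s

Even if those constants are off by 2x, the point stands: small changes
to rocksdb settings or to how much onode data we encode can swing device
traffic enormously, and it's far more tractable to attack that directly
than to try to model it.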
Mark
On 07/22/2016 07:34 PM, EP Komarla wrote:
Team,
Have a performance related question on Ceph.
I know the performance of a Ceph cluster depends on many factors like
the type of storage servers, processors (number of processors, raw
performance per processor), memory, network links, type of disks,
journal disks, etc. On top of the hardware, it is also influenced by the
type of operation you are doing, like seqRead, seqWrite, block size, etc.
Today, one way we demonstrate performance is through benchmarks and test
configurations. As a result, it is difficult to compare performance
without understanding the underlying system and the use cases.
Now, coming to my question: is there a Ceph performance calculator that
takes all (or some) of these factors and gives an estimate of the
performance you can expect for different scenarios? I was asked this
question and didn't know how to answer it, so I thought I would check
with the wider user group to see if someone is aware of such a tool or
knows how to do this calculation. Any pointers will be appreciated.
Thanks,
- epk