On Wed, Oct 10, 2018 at 8:44 AM Frédéric Nass
<frederic.nass@xxxxxxxxxxxxxxxx> wrote:
>
> Hello everyone,
>
> Sorry for raising questions without first reading the previous 10K+
> unread messages :-). I was wondering if there had been any discussions
> regarding:
>
> - Qualifying common hardware from x86_64 manufacturers to create
> performance profiles (networking, kernel, osd). These profiles would
> help to get the best out of the hardware based on the configuration of
> each node.

This has come up many times, which is probably a sign that it's a good
idea and someone should do it :-)

We already have some progress in this direction with the distinct SSD
vs HDD settings for certain OSD parameters. Erwan also discussed working
on some related tooling on a ceph-devel thread ("Presenting the
pre-flight checks project"). We also have the "ceph-brag"
tool/leaderboard concept, though I'm not sure what the status of that is.

But we shouldn't get too hung up on automating this -- even blog posts
that describe hardware setups and the associated tuning are super useful
resources.

> - A minimalist CephOS that would help with the tweaking and performances.

Supporting a whole operating system is hard work -- where there are
specific changes we'd like at the OS level, the best thing is to try to
get them upstream into the commodity Linux distros. Your favourite
Ceph/distro vendors already try to do this, across their OS and storage
products.

Ceph is also often installed on nodes that are multi-purpose (not pure
storage hardware), so we should aim for solutions that don't rely on a
completely controlled, storage-specific environment.

> - Metrics and logging from OSDs that would show when an OSD reaches a
> configuration limit that makes it twiddle its thumbs.

Since we already have so many performance counters from OSDs, I think it
would be interesting to try writing something like this on top of the
existing infrastructure. ceph-mgr modules have access to performance
counters (though you might have to adjust mgr_stats_threshold to see
everything), so it could be reasonably simple to write some Python code
that notices when throughput is stuck at some prescribed limit -- see
the rough sketch at the end of this message.

John

> These questions came to me after I spent hours trying to get decent
> figures out of full-SSD nodes while the host CPU and iostat wouldn't
> exceed 30% and 60 %util. (RHCS support case #02195389)
> I had to disable the WBThrottler, set filestore_queue_max_ops=500000
> and filestore_queue_max_bytes=1048576000.
>
> I understand that tuning really depends on workloads, but it would be
> nice if the OSD could adapt its configuration to the hardware (network
> latency, mixed drive technologies or not, number of cores and GHz vs
> number of OSDs, etc.) and then to the workloads.
> After using device classes, I guess this could be where AI and machine
> learning come into Ceph. As an admin, I'm always wondering whether my
> hardware is weak or whether I missed some
> under-the-hood-never-heard-about-what-does-that-even-do OSD option.
>
> Sorry again for not reading previous posts and not watching all the Ceph
> performance weekly videos. ;-)
>
> Best regards,
>
> Frédéric
>
> --
>
> Frédéric Nass
>
> Sous-direction Infrastructures
> Direction du Numérique
> Université de Lorraine
>
> Tél : +33 3 72 74 11 35
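
Below is a very rough, untested sketch of what such a ceph-mgr module
could look like. The counter names, sampling interval and plateau
threshold are made-up placeholders, and the shape of the
get_all_perf_counters() output is assumed to match what the
prometheus/influx modules consume -- treat it as an illustration of the
idea rather than working code.

# throughput_watch.py -- sketch of a ceph-mgr module that watches OSD
# perf counters and logs a warning when a daemon's throughput looks
# pinned at a plateau.  Counter paths and thresholds are hypothetical.
import time

from mgr_module import MgrModule


class Module(MgrModule):
    SAMPLE_INTERVAL = 10        # seconds between samples (made up)
    WINDOW = 6                  # samples kept per counter (made up)
    PLATEAU_TOLERANCE = 0.02    # <2% spread across the window => "stuck"

    # OSD counters to watch; adjust to whatever paths your cluster exposes
    # (and remember mgr_stats_threshold may hide low-priority counters).
    WATCHED = ('osd.op_w_in_bytes', 'osd.op_r_out_bytes')

    def __init__(self, *args, **kwargs):
        super(Module, self).__init__(*args, **kwargs)
        self._history = {}      # (daemon, counter path) -> recent raw values
        self._run = True

    def serve(self):
        while self._run:
            self._sample()
            time.sleep(self.SAMPLE_INTERVAL)

    def shutdown(self):
        self._run = False

    def _sample(self):
        for daemon, counters in self.get_all_perf_counters().items():
            if not daemon.startswith('osd.'):
                continue
            for path, info in counters.items():
                if path not in self.WATCHED:
                    continue
                hist = self._history.setdefault((daemon, path), [])
                hist.append(info.get('value', 0))
                del hist[:-self.WINDOW]     # keep only the last WINDOW samples
                self._check_plateau(daemon, path, hist)

    def _check_plateau(self, daemon, path, hist):
        if len(hist) < self.WINDOW:
            return
        # The raw counters are cumulative, so look at per-interval deltas.
        deltas = [b - a for a, b in zip(hist, hist[1:])]
        if not all(d > 0 for d in deltas):
            return              # idle (or counter reset); nothing to flag
        spread = (max(deltas) - min(deltas)) / float(max(deltas))
        if spread < self.PLATEAU_TOLERANCE:
            self.log.warning(
                "%s: %s flat at ~%d/s for %d intervals -- possibly hitting "
                "a throttle or configuration limit",
                daemon, path, deltas[-1] // self.SAMPLE_INTERVAL,
                len(deltas))

A real module would probably raise a health warning or expose a command
rather than just logging, but the point is that the plumbing to get at
the counters from Python is already there.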