Mark,
Thanks for the detailed explanation and example. This is exactly what I was looking for.
Best,
Henry Ngo
On Tue, May 2, 2017 at 9:29 AM, Mark Nelson <mnelson@xxxxxxxxxx> wrote:
Hi Henry,
The recovery test mechanism is basically a state machine launched in a separate thread that runs concurrently with whatever benchmark you want to run. The basic premise is that it waits for a configurable amount of "pre" time to let the benchmark get started, then marks the chosen OSDs down/out, waits until the cluster is healthy again, then marks them up/in, and waits until the cluster returns to health. All of this happens while your chosen background load runs. At the end there is a "post" phase where you can specify how long the benchmark should continue running after the recovery process has completed. ceph health is run every second during the whole process and recorded in a log so you can keep track of what's happening while the tests are running.
Typically, once the recovery test is complete, a callback into the benchmark module is made to let the benchmark know the recovery test is done. Usually this will kill the benchmark (i.e. you might choose to run a 4-hour fio test and let the recovery process tell the fio benchmark module to kill fio once recovery finishes). Alternately, with the "repeat" option you can tell it to keep repeating the process until the benchmark itself completes.
The actual yaml to do this is quite simple: put a "recovery_test" section in your cluster section, tell it which OSDs you want to mark down, and optionally give it repeat, pre_time, and post_time options.
Here's an example:
recovery_test:
  osds: [3, 6]
  repeat: True
  pre_time: 60
  post_time: 60
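Since the recovery_test block goes inside the cluster section, in a yaml like the one you posted below it would sit alongside your other cluster settings, something like this (just a sketch; use OSD ids that actually exist in your cluster):

cluster:
  ...                      # your existing cluster settings
  recovery_test:
    osds: [3, 6]
    repeat: True
    pre_time: 60
    post_time: 60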
Here's a paper where this functionality was actually used to predict how long our thrashing tests in the ceph QA lab would take on HDDs versus SSDs. We knew the thrashing tests were taking up most of the time in the lab, and we were able to use this to determine how much buying SSDs would speed up the QA runs.
https://drive.google.com/open?id=0B2gTBZrkrnpZYVpPb3VpTkw5aFk
See appendix B for the ceph.conf that was used at the time for the tests. Also, please do not use the "-n size=64k" mkfs.xfs option in that yaml file. We later found out that it can cause XFS to deadlock and may not be safe to use.
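For what it's worth, the mkfs_opts in the yaml you posted below don't include that flag, so a line along those lines is fine as-is, e.g.:

mkfs_opts: -f -i size=2048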
Mark
On 05/02/2017 10:58 AM, Henry Ngo wrote:
Hi all,
The CBT documentation states that recovery testing during a benchmark can be done. If so, how do I set it up? What do I add in the yaml file? Below is an EC example. Thanks.
cluster:
  head: "ceph@head"
  clients: ["ceph@client"]
  osds: ["ceph@osd"]
  mons: ["ceph@mon"]
  osds_per_node: 1
  fs: xfs
  mkfs_opts: -f -i size=2048
  mount_opts: -o inode64,noatime,logbsize=256k
  conf_file: /home/ceph/ceph-tools/cbt/example/ceph.conf
  ceph.conf: /home/ceph/ceph-tools/cbt/example/ceph.conf
  iterations: 3
  rebuild_every_test: False
  tmp_dir: "/tmp/cbt"
  pool_profiles:
    erasure:
      pg_size: 4096
      pgp_size: 4096
      replication: 'erasure'
      erasure_profile: 'myec'
benchmarks:
  radosbench:
    op_size: [4194304, 524288, 4096]
    write_only: False
    time: 300
    concurrent_ops: [128]
    concurrent_procs: 1
    use_existing: True
    pool_profile: erasure
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com