Hi,
Currently, we do not have a separate cluster network, and our setup is:
- 3 OSD nodes with 1 Gbps links. Each node is running a single OSD daemon, although we plan to increase the number of OSDs per host.
- 3 virtual machines, also with 1 Gbps links, where each VM is running one monitor daemon (two of them also run a metadata server).
- The two clients used for testing are also VMs.
In each run of the fio tool, we perform the following steps, all of them on the client (a minimal shell sketch of steps 1-4 follows the list):
1. Create a 1 GB rbd image within a pool and map this image to a block device.
2. Create an ext4 filesystem on this block device.
3. Unmap the device from the client.
4. Before testing, drop caches (echo 3 | tee /proc/sys/vm/drop_caches && sync).
5. Perform the fio test, setting the pool and the name of the rbd image. The block size used changes in each run.
6. Remove the image from the pool.
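As a minimal sketch, steps 1-4 look roughly like this in shell, assuming the pool (scbench) and image name (image01) from the fio script further below:

# steps 1-4 as shell commands (pool/image names assumed from the fio script)
rbd create --size 1024 scbench/image01         # 1 GB image
dev=$(rbd map scbench/image01)                 # prints the mapped device, e.g. /dev/rbd0
mkfs.ext4 "${dev}"                             # create the ext4 filesystem
rbd unmap "${dev}"                             # unmap before the librbd-based fio run
echo 3 | tee /proc/sys/vm/drop_caches && sync  # drop caches as described above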
Thanks in advance!
On Wed, Oct 5, 2016 at 2:57 PM, Will.Boege <Will.Boege@xxxxxxxxxx> wrote:
What does your network setup look like? Do you have a separate cluster network?
Can you explain how you are performing the FIO test? Are you mounting a volume through krbd and testing that from a different server?

Hello,
We are setting up a new Ceph cluster and running some benchmarks on it. At this moment, our cluster consists of:
- 3 nodes for OSDs; in our current configuration, one daemon per node.
- 3 nodes for monitors (MON). Two of these nodes also run a metadata server (MDS).
Benchmarks are performed with the tools that ceph/rados provides as well as with the fio benchmark tool. Our benchmark tests are based on this tutorial: http://tracker.ceph.com/projects/ceph/wiki/Benchmark_Ceph_Cluster_Performance
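For reference, the rados-level runs from that tutorial look roughly like this (a sketch; we assume the same scbench pool used by the fio script below):

# assumed rados bench invocations following the tutorial
rados bench -p scbench 60 write --no-cleanup
rados bench -p scbench 60 seq
rados -p scbench cleanup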
Using the fio benchmark tool, we are having some issues. After some executions, the fio process gets stuck in a futex_wait_queue_me call:

# cat /proc/14413/stack
[<ffffffffa7af6622>] futex_wait_queue_me+0xd2/0x140
[<ffffffffa7af74bf>] futex_wait+0xff/0x260
[<ffffffffa7aa3a6d>] wake_up_q+0x2d/0x60
[<ffffffffa7af7d11>] futex_requeue+0x2c1/0x930
[<ffffffffa7af8fd1>] do_futex+0x2b1/0xb20
[<ffffffffa7badfb1>] handle_mm_fault+0x14e1/0x1cd0
[<ffffffffa7aa48e8>] wake_up_new_task+0x108/0x1a0
[<ffffffffa7af98c3>] SyS_futex+0x83/0x180
[<ffffffffa7a63981>] __do_page_fault+0x221/0x510
[<ffffffffa7fda736>] system_call_fast_compare_end+0xc/0x96
[<ffffffffffffffff>] 0xffffffffffffffff
The logs of the OSD and MON daemons do not show any information or errors about what the problem could be.
Tracing the execution of the fio process with strace shows the following:
[pid 14416] futex(0x7fffdffa16fc, FUTEX_WAIT_BITSET|FUTEX_CLOCK_REALTIME, 632809, {1475609725, 98199000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
[pid 14416] gettimeofday({1475609725, 98347}, NULL) = 0
[pid 14416] futex(0x7fffdffa16d0, FUTEX_WAKE, 1) = 0
[pid 14416] clock_gettime(CLOCK_MONOTONIC_RAW, {125063, 345690227}) = 0
[pid 14416] futex(0x7fffdffa16fc, FUTEX_WAIT_BITSET|FUTEX_CLOCK_REALTIME, 632811, {1475609725, 348199000}, ffffffff <unfinished ...>
[pid 14429] <... futex resumed> ) = -1 ETIMEDOUT (Connection timed out)
[pid 14429] clock_gettime(CLOCK_REALTIME, {1475609725, 127563261}) = 0
[pid 14429] futex(0x7cefc8, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 14429] futex(0x7cf01c, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 79103, {1475609727, 127563261}, ffffffff <unfinished ...>
[pid 14416] <... futex resumed> ) = -1 ETIMEDOUT (Connection timed out)
[pid 14416] gettimeofday({1475609725, 348403}, NULL) = 0
[pid 14416] futex(0x7fffdffa16d0, FUTEX_WAKE, 1) = 0
[pid 14416] clock_gettime(CLOCK_MONOTONIC_RAW, {125063, 595788486}) = 0
[pid 14416] futex(0x7fffdffa16fc, FUTEX_WAIT_BITSET|FUTEX_CLOCK_REALTIME, 632813, {1475609725, 598199000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
[pid 14416] gettimeofday({1475609725, 598360}, NULL) = 0
[pid 14416] futex(0x7fffdffa16d0, FUTEX_WAKE, 1) = 0
[pid 14416] clock_gettime(CLOCK_MONOTONIC_RAW, {125063, 845712817}) = 0
[pid 14416] futex(0x7fffdffa16fc, FUTEX_WAIT_BITSET|FUTEX_CLOCK_REALTIME, 632815, {1475609725, 848199000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
[pid 14416] gettimeofday({1475609725, 848353}, NULL) = 0
[pid 14416] futex(0x7fffdffa16d0, FUTEX_WAKE, 1) = 0
[pid 14416] clock_gettime(CLOCK_MONOTONIC_RAW, {125064, 95705677}) = 0
[pid 14416] futex(0x7fffdffa16fc, FUTEX_WAIT_BITSET|FUTEX_CLOCK_REALTIME, 632817, {1475609726, 98199000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
[pid 14416] gettimeofday({1475609726, 98359}, NULL) = 0
[pid 14416] futex(0x7fffdffa16d0, FUTEX_WAKE, 1) = 0
[pid 14416] clock_gettime(CLOCK_MONOTONIC_RAW, {125064, 345711731}) = 0
[pid 14416] futex(0x7fffdffa16fc, FUTEX_WAIT_BITSET|FUTEX_CLOCK_REALTIME, 632819, {1475609726, 348199000}, ffffffff <unfinished ...>
[pid 14418] <... futex resumed> ) = -1 ETIMEDOUT (Connection timed out)
[pid 14418] futex(0x7c1f08, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 14418] clock_gettime(CLOCK_REALTIME, {1475609726, 103526543}) = 0
[pid 14418] futex(0x7c1f5c, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 31641, {1475609731, 103526543}, ffffffff <unfinished ...>
[pid 14419] <... futex resumed> ) = -1 ETIMEDOUT (Connection timed out)
....
[pid 14423] clock_gettime(CLOCK_REALTIME, {1475609728, 730557149}) = 0
[pid 14423] clock_gettime(CLOCK_REALTIME, {1475609728, 730727417}) = 0
[pid 14423] futex(0x7c8c34, FUTEX_CMP_REQUEUE_PRIVATE, 1, 2147483647, 0x7c8b60, 15902 <unfinished ...>
[pid 14425] <... futex resumed> ) = 0
[pid 14425] futex(0x7c8b60, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid 14423] <... futex resumed> ) = 1
[pid 14423] futex(0x7c8b60, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid 14425] <... futex resumed> ) = 0
[pid 14425] futex(0x7c8b60, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 14425] clock_gettime(CLOCK_REALTIME, {1475609728, 731160249}) = 0
[pid 14425] sendmsg(3, {msg_name(0)=NULL, msg_iov(2)=[{"\16", 1}, {"\200\4\364W\271\236\224+", 8}], msg_controllen=0, msg_flags=0}, MSG_NOSIGNAL) = 9
[pid 14425] futex(0x7c8c34, FUTEX_WAIT_PRIVATE, 15903, NULL <unfinished ...>
[pid 14423] <... futex resumed> ) = 1
[pid 14423] clock_gettime(CLOCK_REALTIME, {1475609728, 731811246}) = 0
[pid 14423] futex(0x775430, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 14423] futex(0x775494, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 15823, {1475609738, 731811246}, ffffffff <unfinished ...>
[pid 14426] <... restart_syscall resumed> ) = 1
[pid 14426] recvfrom(3, "\17\200\4\364W\271\236\224+", 4096, MSG_DONTWAIT, NULL, NULL) = 9
[pid 14426] clock_gettime(CLOCK_REALTIME, {1475609728, 732608460}) = 0
[pid 14426] poll([{fd=3, events=POLLIN|0x2000}], 1, 900000 <unfinished ...>
[pid 14417] <... futex resumed> ) = 0
[pid 14417] futex(0x771e28, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 14417] futex(0x771eac, FUTEX_WAIT_PRIVATE, 32223, NULL <unfinished ...>
[pid 14416] <... futex resumed> ) = -1 ETIMEDOUT (Connection timed out)
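For reference, these diagnostics can be captured on a stuck fio process roughly as follows (the pgrep-based PID lookup is a hypothetical sketch; any way of finding the stuck PID works):

# hypothetical capture of the same diagnostics on a stuck fio process
pid=$(pgrep -x fio | head -n 1)  # assumes a single fio instance
cat /proc/${pid}/stack           # kernel stack, as shown above
strace -f -p ${pid}              # attach to all threads, as shown above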
This issue has appeared on both of our clients. They run Debian Jessie, each with a different kernel:
- kernel 3.16.7-ckt25-2+deb8u3
- kernel 4.7.2-1~bpo8+1

The following combinations of package versions have been used on both clients:
- Ceph cluster 10.2.2 & FIO 2.1.11-2
- Ceph cluster 10.2.3 & FIO 2.1.11-2
- Ceph cluster 10.2.3 & FIO 2.14
We launch the fio tool varying settings such as block size and operation type.
This is a simplified snippet of the shell script used:
for operation in read write randread randwrite; do
    for rbd in 4K 64K 1M 4M; do
        for bs in 4k 64k 1M 4M; do
            # create rbd image with block size $rbd
            # drop caches
            fio --name=global \
                --ioengine=rbd \
                --clientname=admin \
                --pool=scbench \
                --rbdname=image01 \
                --bs=${bs} \
                --name=rbd_iodeph32 \
                --iodepth=32 \
                --rw=${operation} \
                --output-format=json
            sleep 10
            # delete rbd image
        done
    done
done
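The two commented placeholders in the loop correspond roughly to the following (a sketch; --object-size is the Jewel-era flag for the RBD object size, older releases use --order instead):

# hypothetical expansion of the commented create/delete steps
rbd create --size 1024 --object-size ${rbd} scbench/image01  # before the fio run
rbd rm scbench/image01                                       # after the fio run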
Any ideas why this could be happening? Are we missing some settings in the fio tool?
Regards,
--
Mario Rodríguez
SRE
mariorodriguez@xxxxxxxxxx
+34 914 294 039 — 645 756 437
C/ Gran Vía, nº 28, 6ª planta — 28013 Madrid
Tuenti Technologies, S.L.
www.tuenti.com
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com