Fwd: large concurrent rbd operations block for over 15 mins!

Apparently the graph is too big, so my last post is stuck. Resending without the graph. 

Thanks 


---------- Forwarded message ---------
From: Void Star Nill <void.star.nill@xxxxxxxxx>
Date: Mon, Oct 21, 2019 at 4:41 PM
Subject: large concurrent rbd operations block for over 15 mins!
To: ceph-users <ceph-users@xxxxxxxxxxxxxx>


Hello,

I have been running benchmark tests on a mid-size cluster and am seeing some issues. I would like to know whether this is a bug or something that can be tuned. I would appreciate any help on this.

- I have a 15-node Ceph cluster with 3 monitors and 12 data nodes, with a total of 61 OSDs on SSDs, running version 14.2.4 nautilus (stable). Each node has a 100G link.
- I have 245 client machines from which I trigger rbd operations. Each client has a 25G link.
- The rbd operations are: creating a 50G RBD image with the layering feature, mapping the image on the client machine, formatting the device as ext4, mounting it, running dd to write to the full disk, and cleaning up (unmount, unmap, and remove).
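
For reference, one cycle of the operations listed above can be sketched roughly as follows. This is only an illustration, not the actual benchmark script: the pool name, image name, device path, mount point, and the exact dd arguments are assumptions, and the helper names `build_steps` and `run_timed` are hypothetical. In practice the device path is whatever `rbd map` prints, and the mount point must already exist.

```python
import subprocess
import time


def build_steps(pool, image, device, mountpoint):
    """Return the (name, argv) steps of one benchmark cycle, in order."""
    spec = f"{pool}/{image}"
    return [
        ("create", ["rbd", "create", "--size", "50G",
                    "--image-feature", "layering", spec]),
        ("map", ["rbd", "map", spec]),
        ("format", ["mkfs.ext4", "-q", device]),
        ("mount", ["mount", device, mountpoint]),
        # Assumed dd invocation: it fills the filesystem, so dd is expected
        # to stop with "No space left on device" and a non-zero exit code.
        ("dd", ["dd", "if=/dev/zero", f"of={mountpoint}/fill", "bs=4M"]),
        ("unmount", ["umount", mountpoint]),
        ("unmap", ["rbd", "unmap", device]),
        ("remove", ["rbd", "rm", spec]),
    ]


def run_timed(steps, runner=subprocess.run):
    """Run each step and return a list of (name, seconds, returncode)."""
    timings = []
    for name, argv in steps:
        start = time.monotonic()
        result = runner(argv)
        timings.append((name, time.monotonic() - start, result.returncode))
    return timings
```

Timing each step separately is what makes it possible to say which operation (format, unmount, and so on) is the slow one on a given client.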

If I run these RBD operations concurrently on a small number of machines (say 16-20), they run very well and I see good throughput. All image operations (except for dd) take less than 2 seconds.

However, when I scale it up to 245 clients, each running these operations concurrently, I see a lot of operations hanging for a long time, and the overall throughput drops drastically.

For example, some of the format operations take 10-15 minutes!

Note that all operations do complete, so it's most likely not a deadlock.

I don't see any errors in ceph.log on the monitor nodes. However, the clients do report "hung_task_timeout" in their dmesg logs.

In the graph (omitted from this resend), half of the format operations complete in less than a second, while the other half take over 10 minutes (the y axis is in seconds).



[11117.113618] INFO: task umount:9902 blocked for more than 120 seconds.
[11117.113677]       Tainted: G           OE    4.15.0-51-generic #55~16.04.1-Ubuntu
[11117.113731] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[11117.113787] umount          D    0  9902   9901 0x00000000
[11117.113793] Call Trace:
[11117.113804]  __schedule+0x3d6/0x8b0
[11117.113810]  ? _raw_spin_unlock_bh+0x1e/0x20
[11117.113814]  schedule+0x36/0x80
[11117.113821]  wb_wait_for_completion+0x64/0x90
[11117.113828]  ? wait_woken+0x80/0x80
[11117.113831]  __writeback_inodes_sb_nr+0x8e/0xb0
[11117.113835]  writeback_inodes_sb+0x27/0x30
[11117.113840]  __sync_filesystem+0x51/0x60
[11117.113844]  sync_filesystem+0x26/0x40
[11117.113850]  generic_shutdown_super+0x27/0x120
[11117.113854]  kill_block_super+0x2c/0x80
[11117.113858]  deactivate_locked_super+0x48/0x80
[11117.113862]  deactivate_super+0x5a/0x60
[11117.113866]  cleanup_mnt+0x3f/0x80
[11117.113868]  __cleanup_mnt+0x12/0x20
[11117.113874]  task_work_run+0x8a/0xb0
[11117.113881]  exit_to_usermode_loop+0xc4/0xd0
[11117.113885]  do_syscall_64+0x100/0x130
[11117.113887]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[11117.113891] RIP: 0033:0x7f0094384487
[11117.113893] RSP: 002b:00007fff4199efc8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
[11117.113897] RAX: 0000000000000000 RBX: 0000000000944030 RCX: 00007f0094384487
[11117.113899] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000000000944210
[11117.113900] RBP: 0000000000944210 R08: 0000000000000000 R09: 0000000000000014
[11117.113902] R10: 00000000000006b2 R11: 0000000000000246 R12: 00007f009488d83c
[11117.113903] R13: 0000000000000000 R14: 0000000000000000 R15: 00007fff4199f250
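
To see how widespread these reports are across the 245 clients, the hung-task lines can be pulled out of each client's dmesg output. The snippet below is a minimal sketch that assumes the message format shown in the trace above; the function name `find_hung_tasks` is hypothetical.

```python
import re

# Matches lines such as:
# [11117.113618] INFO: task umount:9902 blocked for more than 120 seconds.
HUNG_TASK = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+INFO: task (?P<comm>.+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds\."
)


def find_hung_tasks(dmesg_text):
    """Return (timestamp, command, pid, seconds) for each hung-task report."""
    hits = []
    for line in dmesg_text.splitlines():
        match = HUNG_TASK.search(line)
        if match:
            hits.append((float(match["ts"]), match["comm"],
                         int(match["pid"]), int(match["secs"])))
    return hits
```

Counting the results per command name (umount, mkfs.ext4, and so on) on each client would show which operations are blocked most often.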