Could anyone run the tests and share some results?
Thanks in advance,
Best,
German
2017-11-30 14:25 GMT-03:00 German Anders <ganders@xxxxxxxxxxxx>:
That's correct, IPoIB for the backend (already configured the IRQ affinity), and 10GbE on the frontend. I would love to try RDMA, but like you said it's not stable for production, so I think I'll have to wait for that. Yeah, the thing is that it's not my decision to go for 50GbE or 100GbE... :( so 10GbE for the front-end it will be...

It would be really helpful if someone could run the following sysbench test on a MySQL DB so I could make some comparisons.

my.cnf configuration file:

[mysqld_safe]
nice = 0
pid-file = /home/test_db/mysql/mysql.pid

[client]
port = 33033
socket = /home/test_db/mysql/mysql.sock

[mysqld]
user = test_db
port = 33033
socket = /home/test_db/mysql/mysql.sock
pid-file = /home/test_db/mysql/mysql.pid
log-error = /home/test_db/mysql/mysql.err
datadir = /home/test_db/mysql/data
tmpdir = /tmp
server-id = 1

# ** Binlogging **
#log-bin = /home/test_db/mysql/binlog/mysql-bin
#log_bin_index = /home/test_db/mysql/binlog/mysql-bin.index
expire_logs_days = 1
max_binlog_size = 512MB

thread_handling = pool-of-threads
thread_pool_max_threads = 300

# ** Slow query log **
slow_query_log = 1
slow_query_log_file = /home/test_db/mysql/mysql-slow.log
long_query_time = 10
log_output = FILE
log_slow_slave_statements = 1
log_slow_verbosity = query_plan,innodb,explain

# ** INNODB Specific options **
transaction_isolation = READ-COMMITTED
innodb_buffer_pool_size = 12G
innodb_data_file_path = ibdata1:256M:autoextend
innodb_thread_concurrency = 16
innodb_log_file_size = 256M
innodb_log_files_in_group = 3
innodb_file_per_table
innodb_log_buffer_size = 16M
innodb_stats_on_metadata = 0
innodb_lock_wait_timeout = 30
# innodb_flush_method = O_DSYNC
innodb_flush_method = O_DIRECT

max_connections = 10000
max_connect_errors = 999999
max_allowed_packet = 128M
skip-host-cache
skip-name-resolve
explicit_defaults_for_timestamp = 1
performance_schema = OFF
log_warnings = 2
event_scheduler = ON

# ** Specific Galera Cluster Settings **
binlog_format = ROW
default-storage-engine = innodb
query_cache_size = 0
query_cache_type = 0

The volume is just an RBD (on an RF=3 pool) with the default 22-bit order, mounted on /home/test_db/mysql/data (a rough sketch of how to recreate that volume is at the end of this message).

Commands for the test:

sysbench --test=/usr/share/sysbench/tests/include/oltp_legacy/parallel_prepare.lua --mysql-host=<hostname> --mysql-port=33033 --mysql-user=sysbench --mysql-password=sysbench --mysql-db=sysbench --mysql-table-engine=innodb --db-driver=mysql --oltp_tables_count=10 --oltp-test-mode=complex --oltp-read-only=off --oltp-table-size=200000 --threads=10 --rand-type=uniform --rand-init=on cleanup > /dev/null 2>/dev/null

sysbench --test=/usr/share/sysbench/tests/include/oltp_legacy/parallel_prepare.lua --mysql-host=<hostname> --mysql-port=33033 --mysql-user=sysbench --mysql-password=sysbench --mysql-db=sysbench --mysql-table-engine=innodb --db-driver=mysql --oltp_tables_count=10 --oltp-test-mode=complex --oltp-read-only=off --oltp-table-size=200000 --threads=10 --rand-type=uniform --rand-init=on prepare > /dev/null 2>/dev/null

sysbench --test=/usr/share/sysbench/tests/include/oltp_legacy/oltp.lua --mysql-host=<hostname> --mysql-port=33033 --mysql-user=sysbench --mysql-password=sysbench --mysql-db=sysbench --mysql-table-engine=innodb --db-driver=mysql --oltp_tables_count=10 --oltp-test-mode=complex --oltp-read-only=off --oltp-table-size=200000 --threads=20 --rand-type=uniform --rand-init=on --time=120 run > result_sysbench_perf_test.out 2>/dev/null

I'm looking for tps, qps and the 95th percentile latency. Could anyone with an all-NVMe cluster run the test and share the results? I would really appreciate the help :)

Thanks in advance,
Best,
German
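P.S. In case anyone wants to reproduce the volume layout above, it is roughly along these lines - the pool name, image name and size below are just placeholders, not the exact ones from my cluster:

rbd create rbd_rf3/mysql_bench --size 200G    # order 22 (4 MiB objects) is the default
rbd map rbd_rf3/mysql_bench                   # returns a device, e.g. /dev/rbd0
mkfs.xfs /dev/rbd0
mount /dev/rbd0 /home/test_db/mysql/data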
2017-11-29 19:14 GMT-03:00 Zoltan Arnold Nagy <zoltan@xxxxxxxxxxxxxxxxxx>:
On 2017-11-27 14:02, German Anders wrote:
> 4x 2U servers:

so I assume you are using IPoIB as the cluster network for the replication...

> 1x 82599ES 10-Gigabit SFI/SFP+ Network Connection
> 1x Mellanox ConnectX-3 InfiniBand FDR 56Gb/s Adapter (dual port)
> 1x OneConnect 10Gb NIC (quad-port) - in a bond configuration
> (active/active) with 3 vlans

... and the 10GbE network for the front-end network?
At 4k writes your network latency will be very high (see the flame graphs in the Intel NVMe presentation from the Boston OpenStack Summit - not sure if there is a newer deck that somebody could link ;)) and the time will be spent in the kernel. You could give the RDMA messenger a try, but it's not stable in the current LTS release.
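(For reference, trying it out is roughly a matter of switching the messenger type in ceph.conf on every node and restarting the daemons - the option names below are from memory for the Luminous-era async+rdma messenger, and the device name is only an example, check ibv_devices on your hosts:)

[global]
# experimental RDMA messenger - not for production, as noted above
ms_type = async+rdma
# example RDMA device name only
ms_async_rdma_device_name = mlx4_0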
If I were you I'd be looking at 100GbE - we've recently pulled in a bunch of 100GbE links and it's been wonderful to see 100+GB/s going over the network for just storage.
Some people suggested mounting multiple RBD volumes - but unless I'm mistaken, and you're using a very recent qemu/libvirt combination with the proper libvirt disk settings, all IO will still be single-threaded towards librbd, so it won't give any speedup.
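(For what it's worth, with a kernel-mounted RBD like the one used in this test, the "multiple volumes" idea usually comes down to striping an LVM logical volume across several images so the load is spread over more objects and OSDs. A rough sketch with placeholder names, assuming the images map to /dev/rbd0-3 on a clean host:)

# four smaller images instead of one big one - names and sizes are placeholders
for i in 1 2 3 4; do rbd create bench/vol$i --size 100G; rbd map bench/vol$i; done
pvcreate /dev/rbd0 /dev/rbd1 /dev/rbd2 /dev/rbd3
vgcreate vg_mysql /dev/rbd0 /dev/rbd1 /dev/rbd2 /dev/rbd3
# stripe the logical volume across all four physical volumes
lvcreate -i 4 -I 4M -l 100%FREE -n lv_data vg_mysql
mkfs.xfs /dev/vg_mysql/lv_data
mount /dev/vg_mysql/lv_data /home/test_db/mysql/data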