Hello everyone, I need some help with our Ceph 16.2.5 cluster used as an iSCSI target for ESXi nodes.

Background info:
- we have built 3 OSD nodes with 60 BlueStore OSDs: 60x 6 TB spinning disks, 12 SSDs and 3 NVMe devices
- the OSD nodes have 32 cores and 256 GB RAM each
- the OSD disks are connected to a SCSI RAID controller ... each disk is configured as a single-disk RAID0 with write-back enabled to use the RAID controller cache
- we have 3 MONs and 2 iSCSI gateways
- all servers are connected to a 10 Gbit network (switches)
- all servers have two 10 Gbit network adapters configured as bond-rr
- we created one RBD pool with autoscaling and 128 PGs (at the moment)
- the pool currently contains 5 RBD images: 2x 10 TB and 3x 500 GB, with the exclusive-lock feature and striping v2 (4 MB object size / 1 MB stripe unit / stripe count 4); a rough sketch of the create command is further down
- all images are attached to the two iSCSI gateways running tcmu-runner 1.5.4 and exposed as iSCSI targets
- we have 6 ESXi 6.7u3 servers as compute nodes connected to the Ceph iSCSI target

ESXi iSCSI config:

esxcli system settings advanced set -o /ISCSI/MaxIoSizeKB -i 512
esxcli system module parameters set -m iscsi_vmk -p iscsivmk_LunQDepth=64
esxcli system module parameters set -m iscsi_vmk -p iscsivmk_HostQDepth=64
esxcli system settings advanced set --int-value 1 --option /DataMover/HardwareAcceleratedMove

The OSD nodes, MONs, RGW/iSCSI gateways and ESXi nodes are all connected to the 10 Gbit network with bond-rr.

RBD benchmark test:

root@cd133-ceph-osdh-01:~# rados bench -p rbd 10 write
hints = 1
Maintaining 16 concurrent writes of 4194304 bytes to objects of size 4194304 for up to 10 seconds or 0 objects
Object prefix: benchmark_data_cd133-ceph-osdh-01_87894
  sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat(s)  avg lat(s)
    0       0         0         0         0         0            -           0
    1      16        69        53   211.987       212     0.250578    0.249261
    2      16       129       113   225.976       240     0.296519    0.266439
    3      16       183       167   222.641       216     0.219422    0.273838
    4      16       237       221   220.974       216     0.469045     0.28091
    5      16       292       276   220.773       220     0.249321     0.27565
    6      16       339       323   215.307       188     0.205553     0.28624
    7      16       390       374   213.688       204     0.188404    0.290426
    8      16       457       441   220.472       268     0.181254    0.286525
    9      16       509       493   219.083       208     0.250538    0.286832
   10      16       568       552   220.772       236     0.307829    0.286076
Total time run:         10.2833
Total writes made:      568
Write size:             4194304
Object size:            4194304
Bandwidth (MB/sec):     220.941
Stddev Bandwidth:       22.295
Max bandwidth (MB/sec): 268
Min bandwidth (MB/sec): 188
Average IOPS:           55
Stddev IOPS:            5.57375
Max IOPS:               67
Min IOPS:               47
Average Latency(s):     0.285903
Stddev Latency(s):      0.115162
Max latency(s):         0.88187
Min latency(s):         0.119276
Cleaning up (deleting benchmark objects)
Removed 568 objects
Clean up completed and total clean up time: 3.18627

The benchmark shows that at least ~250 MB/s is possible... and I have actually seen much more, up to 550 MB/s.

If I start iftop on one OSD node, I see the Ceph iSCSI gateways (their hostnames show up as rgw) and the traffic is nearly 80 MB/s:
[image: grafik] <https://user-images.githubusercontent.com/54031716/134509089-2c218b23-7460-4cdb-b54a-e660c91d599e.png>

The Ceph dashboard shows that the iSCSI write performance is only 40 MB/s; the maximum I saw was between 40 and 60 MB/s - very poor:
[image: grafik] <https://user-images.githubusercontent.com/54031716/134509280-17c6b4b1-d740-43c9-9b8b-bb77333357a0.png>

If I look at the vCenter and ESXi datastore performance, I see very high storage device latencies between 50 and 100 ms - very bad:
[image: grafik] <https://user-images.githubusercontent.com/54031716/134509746-c9971592-4129-4f27-a36b-25d50035d437.png>
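For reference, the striped images were created roughly like the first command below (reconstructed from memory), and an equivalent image-level write test directly via librbd, bypassing the iSCSI path, would look like the second command. Image name, size and test length are placeholders, not the real values:

# image name, size and test length are placeholders
rbd create rbd/esxi-lun-01 --size 10T --object-size 4M --stripe-unit 1M --stripe-count 4 --image-feature exclusive-lock
rbd bench --io-type write --io-size 4M --io-threads 16 --io-total 10G --io-pattern seq rbd/esxi-lun-01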
root@cd133-ceph-mon-01:/home/cephadm# ceph config dump
WHO         MASK       LEVEL     OPTION                                        VALUE                                          RO
global                 basic     container_image                               docker.io/ceph/ceph@sha256:829ebf54704f2d827de00913b171e5da741aad9b53c1f35ad59251524790eceb  *
global                 advanced  journal_max_write_bytes                       1073714824
global                 advanced  journal_max_write_entries                     10000
global                 advanced  mon_osd_cache_size                            1024
global                 dev       osd_client_watch_timeout                      15
global                 dev       osd_heartbeat_interval                        5
global                 advanced  osd_map_cache_size                            128
global                 advanced  osd_max_write_size                            512
global                 advanced  rados_osd_op_timeout                          5
global                 advanced  rbd_cache_max_dirty                           134217728
global                 advanced  rbd_cache_max_dirty_age                       5.000000
global                 advanced  rbd_cache_size                                268435456
global                 advanced  rbd_op_threads                                2
mon                    advanced  auth_allow_insecure_global_id_reclaim         false
mon                    advanced  cluster_network                               10.50.50.0/24                                  *
mon                    advanced  public_network                                10.50.50.0/24                                  *
mgr                    advanced  mgr/cephadm/container_init                    True                                           *
mgr                    advanced  mgr/cephadm/device_enhanced_scan              true                                           *
mgr                    advanced  mgr/cephadm/migration_current                 2                                              *
mgr                    advanced  mgr/cephadm/warn_on_stray_daemons             false                                          *
mgr                    advanced  mgr/cephadm/warn_on_stray_hosts               false                                          *
mgr                    advanced  mgr/dashboard/10.50.50.21/server_addr                                                        *
mgr                    advanced  mgr/dashboard/ALERTMANAGER_API_HOST           http://10.221.133.161:9093                     *
mgr                    advanced  mgr/dashboard/GRAFANA_API_SSL_VERIFY          false                                          *
mgr                    advanced  mgr/dashboard/GRAFANA_API_URL                 https://10.221.133.161:3000                    *
mgr                    advanced  mgr/dashboard/ISCSI_API_SSL_VERIFICATION      true                                           *
mgr                    advanced  mgr/dashboard/NAME/server_port                80                                             *
mgr                    advanced  mgr/dashboard/PROMETHEUS_API_HOST             http://10.221.133.161:9095                     *
mgr                    advanced  mgr/dashboard/PROMETHEUS_API_SSL_VERIFY       false                                          *
mgr                    advanced  mgr/dashboard/RGW_API_ACCESS_KEY              W8VEKVFDK1RH5IH2Q3GN                           *
mgr                    advanced  mgr/dashboard/RGW_API_SECRET_KEY              IkIjmjfh3bMLrPOlAFbMfpigSIALAQoKGEHzZgxv       *
mgr                    advanced  mgr/dashboard/camdatadash/server_addr         10.251.133.161                                 *
mgr                    advanced  mgr/dashboard/camdatadash/ssl_server_port     8443                                           *
mgr                    advanced  mgr/dashboard/cd133-ceph-mon-01/server_addr                                                  *
mgr                    advanced  mgr/dashboard/dasboard/server_port            80                                             *
mgr                    advanced  mgr/dashboard/dashboard/server_addr           10.251.133.161                                 *
mgr                    advanced  mgr/dashboard/dashboard/ssl_server_port       8443                                           *
mgr                    advanced  mgr/dashboard/server_addr                     0.0.0.0                                        *
mgr                    advanced  mgr/dashboard/server_port                     8080                                           *
mgr                    advanced  mgr/dashboard/ssl                             false                                          *
mgr                    advanced  mgr/dashboard/ssl_server_port                 8443                                           *
mgr                    advanced  mgr/orchestrator/orchestrator                 cephadm
mgr                    advanced  mgr/prometheus/server_addr                    0.0.0.0                                        *
mgr                    advanced  mgr/telemetry/channel_ident                   true                                           *
mgr                    advanced  mgr/telemetry/contact                         hf@xxxxx                                       *
mgr                    advanced  mgr/telemetry/description                     ceph cluster                                   *
mgr                    advanced  mgr/telemetry/enabled                         true                                           *
mgr                    advanced  mgr/telemetry/last_opt_revision               3                                              *
osd                    dev       bluestore_cache_autotune                      false
osd         class:ssd  dev       bluestore_cache_autotune                      false
osd                    dev       bluestore_cache_size                          4000000000
osd         class:ssd  dev       bluestore_cache_size                          4000000000
osd                    dev       bluestore_cache_size_hdd                      4000000000
osd                    dev       bluestore_cache_size_ssd                      4000000000
osd         class:ssd  dev       bluestore_cache_size_ssd                      4000000000
osd                    advanced  bluestore_default_buffered_write              true
osd         class:ssd  advanced  bluestore_default_buffered_write              true
osd                    advanced  osd_max_backfills                             1
osd         class:ssd  dev       osd_memory_cache_min                          4000000000
osd         class:hdd  basic     osd_memory_target                             6000000000
osd         class:ssd  basic     osd_memory_target                             6000000000
osd                    advanced  osd_recovery_max_active                       3
osd                    advanced  osd_recovery_max_single_start                 1
osd                    advanced  osd_recovery_sleep                            0.000000
client.rgw.ceph-rgw.cd133-ceph-rgw-01.klvrwk  basic  rgw_frontends  beast port=8000  *
client.rgw.ceph-rgw.cd133-ceph-rgw-01.ptmqcm  basic  rgw_frontends  beast port=8001  *
client.rgw.ceph-rgw.cd88-ceph-rgw-01.czajah   basic  rgw_frontends  beast port=8000  *
client.rgw.ceph-rgw.cd88-ceph-rgw-01.pdknfg   basic  rgw_frontends  beast port=8000  *
client.rgw.ceph-rgw.cd88-ceph-rgw-01.qkdlfl   basic  rgw_frontends  beast port=8001  *
client.rgw.ceph-rgw.cd88-ceph-rgw-01.tdsxpb   basic  rgw_frontends  beast port=8001  *
client.rgw.ceph-rgw.cd88-ceph-rgw-01.xnadfr   basic  rgw_frontends  beast port=8001  *
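For completeness, the non-default values above were applied with "ceph config set"; a few examples, reconstructed from memory with the values shown in the dump:

# examples only; values taken from the config dump above
ceph config set global rbd_cache_size 268435456
ceph config set osd/class:hdd osd_memory_target 6000000000
ceph config set osd bluestore_default_buffered_write true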
Can somebody explain what I am doing wrong, or what I can do to get better performance out of ceph-iscsi? No matter what I do or tweak, the write performance does not get better. I have already experimented with gwcli, the iSCSI queue settings and other options. Currently I have set:

hw_max_sectors 8192
max_data_area_mb 32
cmdsn_depth 64 (the ESXi nodes are already fixed at a maximum of 64 iSCSI commands)

Everything else looks fine: multipathing is working and recovery is fast... but iSCSI is very slow and I don't know why. Can somebody help me?

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx