Hello 😊,
Thank you very much for your response.
Let me give you some more information.
I do not have any MS (Windows) VMs, only Debian and Ubuntu VMs.
I have a *Proxmox Cluster* with *6 hosts*. The network setup is as follows:
* *10G link* for the Ceph cluster network
* *10G link* for the Ceph public network
* *1G link* for Corosync
* *1G IPMI*
* *10G link* for VMs
Each host has *2 or 3 OSDs (15TB NVMe)*. The hosts are *heterogeneous*,
but all have *512GB RAM*.
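For completeness: a simple way to double-check the raw throughput of the 10G
links themselves would be an iperf3 run between two hosts. This is only a
sketch, assuming iperf3 is installed on both hosts; the address below is a
placeholder for the Ceph public IP of the server-side host:

iperf3 -s                                      # on host A (server side)
iperf3 -c <host-A-ceph-public-ip> -P 4 -t 30   # on host B, 4 parallel streams, 30s

This only measures TCP throughput on that link, not Ceph itself.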
I do not observe any bottlenecks in *htop or iftop*, and *iostat*
reports only *0.12% iowait*. However, *fio* test results are concerning.
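To separate the VM layer from the RADOS layer, I could also benchmark
directly on one of the hosts. A sketch, assuming a scratch pool named
testbench (the pool name is just an example):

ceph osd perf                                  # per-OSD commit/apply latency
rados bench -p testbench 60 write --no-cleanup
rados bench -p testbench 60 rand
rados -p testbench cleanup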
Here is the *fio* command I used:
fio --name=registry-read --ioengine=libaio --rw=randread --bs=4k \
    --numjobs=4 --size=1G --runtime=60 --group_reporting
registry-read: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B,
(T) 4096B-4096B, ioengine=libaio, iodepth=1
...
fio-3.33
Starting 4 processes
registry-read: Laying out IO file (1 file / 1024MiB)
registry-read: Laying out IO file (1 file / 1024MiB)
registry-read: Laying out IO file (1 file / 1024MiB)
registry-read: Laying out IO file (1 file / 1024MiB)
Jobs: 4 (f=4): [r(4)][100.0%][r=39.8MiB/s][r=10.2k IOPS][eta 00m:00s]
registry-read: (groupid=0, jobs=4): err= 0: pid=231332: Sun Mar 16
22:30:24 2025
read: IOPS=10.2k, BW=39.7MiB/s (41.7MB/s)(2385MiB/60001msec)
slat (usec): min=194, max=13111, avg=390.63, stdev=80.29
clat (nsec): min=910, max=190362, avg=1521.76, stdev=873.64
lat (usec): min=195, max=13114, avg=392.15, stdev=80.35
clat percentiles (nsec):
| 1.00th=[ 1112], 5.00th=[ 1208], 10.00th=[ 1224], 20.00th=[ 1272],
| 30.00th=[ 1288], 40.00th=[ 1320], 50.00th=[ 1352], 60.00th=[ 1400],
| 70.00th=[ 1496], 80.00th=[ 1704], 90.00th=[ 1960], 95.00th=[ 2224],
| 99.00th=[ 2832], 99.50th=[ 3856], 99.90th=[12096], 99.95th=[16768],
| 99.99th=[26240]
bw ( KiB/s): min=31984, max=43288, per=100.00%, avg=40730.22,
stdev=381.52, samples=476
iops : min= 7996, max=10822, avg=10182.55, stdev=95.38,
samples=476
lat (nsec) : 1000=0.02%
lat (usec) : 2=91.02%, 4=8.48%, 10=0.32%, 20=0.12%, 50=0.03%
lat (usec) : 100=0.01%, 250=0.01%
cpu : usr=0.80%, sys=5.99%, ctx=610640, majf=0, minf=47
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%,
>=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%,
>=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%,
>=64=0.0%
issued rwts: total=610483,0,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1
Run status group 0 (all jobs):
READ: bw=39.7MiB/s (41.7MB/s), 39.7MiB/s-39.7MiB/s
(41.7MB/s-41.7MB/s), io=2385MiB (2501MB), run=60001-60001msec
Summary:
*Test Results:*
* *IOPS:* 10.2k
* *Bandwidth:* 39.7MiB/s (41.7MB/s)
* *Latency:*
     o Avg (total lat): *392µs*
     o 99.9th percentile (clat): *~12µs*
* *CPU Usage:* usr=0.80%, sys=5.99%
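One thing I notice in the output: the submission latency dominates (slat avg
≈ 390µs) while the completion latency is tiny (clat avg ≈ 1.5µs). As far as I
understand, with libaio and buffered I/O at iodepth=1 the submission
effectively blocks, so the test may be measuring page cache plus synchronous
submission rather than the RBD device. In case it is relevant, here is a
variant with direct I/O and a deeper queue that I could run; the parameters
are just an example, not a recommendation:

fio --name=registry-read-direct --ioengine=libaio --rw=randread --bs=4k \
    --numjobs=4 --iodepth=32 --direct=1 --size=1G --runtime=60 --group_reporting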
Kind regards,
Gio
On 11.03.2025 at 11:55, Giovanna Ratini wrote:
Hello everyone,
We are running Ceph in Proxmox with a 10G network.
Unfortunately, we are experiencing very low read rates. I will try to
implement the solution recommended in the Proxmox forum. However, even
80 MB per second with an NVMe drive is quite disappointing.
Forum link
<https://forum.proxmox.com/threads/slow-performance-on-ceph-per-vm.151223/#post-685070>
For this reason, we are considering purchasing a 100G switch for our
servers.
This raises some questions:
Should I still use separate networks for VMs and Ceph with 100G?
I have read that running Ceph on bridged connections is not recommended.
Does anyone have experience with 100G Ceph in Proxmox?
Is upgrading to 100G a good idea, or will I have 60G sitting idle?
Thanks in advance!
Gio
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx