On 2021-06-23 14:51, Alexander E. Patrakov wrote:
On Tue, 22 Jun 2021 at 23:22, Gilles Mocellin
<gilles.mocellin@xxxxxxxxxxxxxx> wrote:
Hello Cephers,
On a capacity-oriented Ceph cluster (13 nodes, 130 OSDs on 8 TB HDDs), I'm
migrating a 40 TB image from a 3+2 EC pool to an 8+2 one.
The use case is Veeam backup on XFS filesystems, mounted via KRBD.
Backups are running, and I can see 200 MB/s of throughput.
But my migration (rbd migration prepare / execute) has been stalled at 4%
for 6 hours now.
When the backups are not running, I see only about 20 MB/s of throughput,
which is most likely the migration itself.
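For reference, the migration was set up roughly like this (just a sketch;
the destination data pool name below is a placeholder, not the real one):

  # Re-link the image to a destination created with the new layout;
  # here the 8+2 EC pool is attached as a data pool (pool name is hypothetical)
  rbd migration prepare --data-pool veeam-ec-8-2 veeam-repos/veeam-repo4-vol2
  # Copy the data in the background; this is the step stuck at 4%
  rbd migration execute veeam-repos/veeam-repo4-vol2
  # Only once execute reaches 100%: remove the old source image
  rbd migration commit veeam-repos/veeam-repo4-vol2
  # Progress and state can be checked at any time with
  rbd status veeam-repos/veeam-repo4-vol2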
[...]
I suggest that you cancel the migration and don't ever attempt it
again because big EC setups are very easy to overload with IOPS.
When I worked at croit GmbH, we had a very unhappy customer with
almost the same setup as you are trying to achieve: Veeam Backup, XFS
on rbd on a 8+3 EC pool of HDDs. Their complaint was that both the
backup and restore were extremely slow, ~3 MB/s, and with 200 ms of
latency, but I would call their cluster overloaded due to too many
concurrent backups. We tried, unsuccessfully, to tune their setup, but
our final recommendation (successfully benchmarked but rejected due to
costs) was to create a separate replica 3 pool for new backups.
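In concrete terms, that recommendation boils down to something like the
following (the pool name and PG count here are placeholders, not what was
actually benchmarked):

  # Replicated pool (size 3) dedicated to new backup images
  ceph osd pool create veeam-replica3 256 256 replicated
  ceph osd pool set veeam-replica3 size 3
  rbd pool init veeam-replica3
  # New Veeam repository volumes get created here instead of on the EC pool
  rbd create --size 40T veeam-replica3/veeam-repo-new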
Argh...
The load was not terrible, and when no backup was running it was very low,
yet the migration really seems blocked.
And now, stranger still, the abort doesn't finish. The image status has
become unknown:
root -> rbd status veeam-repos/veeam-repo4-vol2
Watchers:
        watcher=100.99.103.54:0/1497373484 client.5113986 cookie=139751457839168
Migration:
        source: veeam-repos/veeam-repo4-vol2 (17e5e3267adad3)
        destination: veeam-repos/veeam-repo4-vol2 (4debed1a2ed31b)
        state: unknown
The connected client (the watcher above) is the rbd migration abort command itself.
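For completeness, this is all that was run, nothing exotic; I'm just polling
the state from another shell while the abort (hopefully) makes progress:

  # The abort that is currently holding the watch on the image
  rbd migration abort veeam-repos/veeam-repo4-vol2
  # From another shell, poll the migration state and the cluster health
  rbd status veeam-repos/veeam-repo4-vol2
  ceph -s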