Hi,
I'm having a problem with a new Ceph deployment that uses RBD mirroring, and I'm hoping someone can help me out or point me in the right direction.
I have a Ceph Jewel install with 2 clusters (zone1, zone2). RBD itself is working fine, but RBD mirroring between the sites is not working correctly.
I have configured pool replication on the default rbd pool, set up the peers, and created 2 test images:
[root@mon3 ceph]# rbd --user zone1 --cluster zone1 mirror pool info
Mode: pool
Peers:
UUID NAME CLIENT
397b37ef-8300-4dd3-a637-2a03c3b9289c zone2 client.zone2
[root@mon3 ceph]# rbd --user zone2 --cluster zone2 mirror pool info
Mode: pool
Peers:
UUID NAME CLIENT
2c11f1dc-67a4-43f1-be33-b785f1f6b366 zone1 client.zone1
The primary is OK:
[root@mon3 ceph]# rbd --user zone1 --cluster zone1 mirror pool status --verbose
health: OK
images: 2 total
2 stopped
test-2:
global_id: 511e3aa4-0e24-42b4-9c2e-8d84fc9f48f4
state: up+stopped
description: remote image is non-primary or local image is primary
last_update: 2017-03-16 17:38:08
And the secondary is always in this state:
[root@mon3 ceph]# rbd --user zone2 --cluster zone2 mirror pool status --verbose
health: WARN
images: 2 total
1 syncing
test-2:
global_id: 511e3aa4-0e24-42b4-9c2e-8d84fc9f48f4
state: up+syncing
description: bootstrapping, OPEN_LOCAL_IMAGE
last_update: 2017-03-16 17:41:02
Sometimes it goes into the replay state with health OK for a couple of seconds, but then it goes back to bootstrapping, OPEN_LOCAL_IMAGE. What does this state mean?
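To catch those short-lived state changes, the verbose status output could be parsed and logged on every poll. Below is a minimal sketch of that idea, not a tested monitoring tool. The names `parse_mirror_status` and `fetch_status` are hypothetical helpers, and the parser assumes the output keeps the layout shown above (a bare `name:` line followed by `key: value` lines). The rbd arguments are the same ones used in the commands above.

```python
import subprocess

# Sample copied from the secondary cluster's output above.
SAMPLE = """\
health: WARN
images: 2 total
1 syncing
test-2:
global_id: 511e3aa4-0e24-42b4-9c2e-8d84fc9f48f4
state: up+syncing
description: bootstrapping, OPEN_LOCAL_IMAGE
last_update: 2017-03-16 17:41:02
"""


def parse_mirror_status(text):
    """Parse verbose mirror pool status text into {image: {field: value}}."""
    images = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        key, sep, value = line.partition(":")
        value = value.strip()
        if sep and not value and " " not in key:
            # A bare "name:" line starts a new per-image block.
            current = images.setdefault(key, {})
        elif sep and current is not None:
            # "key: value" lines after an image header belong to that image.
            current[key.strip()] = value
    return images


def fetch_status(cluster, user):
    """Run the same status command as above and return its text (needs rbd installed)."""
    cmd = ["rbd", "--user", user, "--cluster", cluster,
           "mirror", "pool", "status", "--verbose"]
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout


status = parse_mirror_status(SAMPLE)
for image, fields in status.items():
    print(image, fields["state"], fields["description"])
```

Calling `parse_mirror_status(fetch_status("zone2", "zone2"))` in a loop with a short sleep, and printing only when the state differs from the previous poll, should show how often the image flips between the two states.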
In the log files I have this error:
2017-03-16 17:43:02.404372 7ff6262e7700 -1 librbd::ImageWatcher: 0x7ff654003190 error requesting lock: (30) Read-only file system
2017-03-16 17:43:03.411327 7ff6262e7700 -1 librbd::ImageWatcher: 0x7ff654003190 error requesting lock: (30) Read-only file system
2017-03-16 17:43:04.420074 7ff6262e7700 -1 librbd::ImageWatcher: 0x7ff654003190 error requesting lock: (30) Read-only file system
2017-03-16 17:43:05.422253 7ff6262e7700 -1 librbd::ImageWatcher: 0x7ff654003190 error requesting lock: (30) Read-only file system
2017-03-16 17:43:06.428447 7ff6262e7700 -1 librbd::ImageWatcher: 0x7ff654003190 error requesting lock: (30) Read-only file system
I'm not sure which file it refers to as read-only. I have tried to strace it, but couldn't find it.
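As far as I can tell, the "(30)" in those lines is a standard errno value followed by its text, so the message may not point to an actual file on disk at all. A small sketch to decode it, assuming the log lines keep the "(N) message" suffix seen above; `decode_errno` is a hypothetical helper name:

```python
import errno
import os
import re

# One of the log lines from above.
LOG_LINE = ("2017-03-16 17:43:02.404372 7ff6262e7700 -1 librbd::ImageWatcher: "
            "0x7ff654003190 error requesting lock: (30) Read-only file system")


def decode_errno(line):
    """Return (number, symbolic name, message) for a '(N)' errno in the line, or None."""
    match = re.search(r"\((\d+)\)", line)
    if match is None:
        return None
    number = int(match.group(1))
    # errno.errorcode maps numeric values to names for the current platform.
    name = errno.errorcode.get(number, "UNKNOWN")
    return number, name, os.strerror(number)


print(decode_errno(LOG_LINE))
```

On Linux, 30 maps to EROFS, which is the generic "read-only" error code that a library can return for its own reasons, not only for a read-only mounted file system.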
I have disabled SELinux just in case, but the result is the same. The OS is RHEL 7.2, by the way.
If I do a demote/promote of the image, I get the same state and errors on the other cluster.
If someone could help, it would be great.
Thanks in advance.
Regards