Benjamin Marzinski wrote:
Are you exporting the gnbds in cached or uncached mode (with the -c
option or not)? In uncached mode, you should be able to run "gnbd_import -Or <gnbd>".
It won't actually remove the device if it is open, but it should cause
all the pending IOs to fail.
Hi Ben - I am running uncached (NOT using -c).
In uncached mode, after your timeout, all the
IOs should get flushed assuming that gnbd can fence the server.
If this isn't happening, can you please send me a more complete description
of your gnbd setup and problem, including the results of the following set of
commands, run after the server node fails?
We have a high-availability server pair, and need a common storage pool
and require no single point of failure, so we don't like any of the SAN
approaches.
Instead we have created an md device (raid level 1), with one local and
one gnbd imported device. We are not using multipath; instead each
server has two bound network devices, connected to different hubs, with
one
The gnbd import and raid mount are managed as a cluster service, which
can fail over between the two nodes and only ever runs on one node at
any given moment.
We are using DLM locking, and fenced with a proprietary fence method
(although fencing is not a vital part of data integrity in our scheme).
Failure of the active node is handled well: the services migrate, the
remaining node mounts a (degraded) md device from its local disk, and
cluster operation is maintained.
When the dead node returns, a custom script hot-adds the returning
gnbd_imported disk, and the md device recovers.
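For anyone wanting to reproduce the hot-add step, here is a minimal Python sketch of what such a script might look like. It is not our actual script. It assumes the gnbd device has already been re-imported, and the names /dev/md0 and /dev/gnbd/mirror used further down are placeholders. It reads the [UU] / [U_] status field from /proc/mdstat and only runs "mdadm <md> --add <device>" when the array is degraded. If the old member is still listed as faulty, it may need to be removed before it can be added again.

```python
import re
import subprocess


def run_command(argv):
    """Run a command and return its exit status."""
    return subprocess.call(list(argv))


def array_is_degraded(mdstat_text, md_name):
    """Return True if the [UU]-style status field of md_name contains '_'."""
    in_array = False
    for line in mdstat_text.splitlines():
        if line.startswith(md_name + " :"):
            in_array = True
            continue
        if in_array:
            if not line.strip():
                break  # a blank line ends this array's entry
            match = re.search(r"\[([U_]+)\]", line)
            if match:
                return "_" in match.group(1)
    raise ValueError("no status found for %s" % md_name)


def readd_command(md_device, gnbd_device):
    """Build the mdadm command that hot-adds the returning member."""
    return ["mdadm", md_device, "--add", gnbd_device]


def hot_add_if_degraded(md_device, gnbd_device, mdstat_text, run=run_command):
    """Hot-add gnbd_device when md_device is degraded.

    Returns True if the add command was issued and succeeded.
    """
    md_name = md_device.rsplit("/", 1)[-1]
    if not array_is_degraded(mdstat_text, md_name):
        return False
    return run(readd_command(md_device, gnbd_device)) == 0
```

Typical use would be to read /proc/mdstat into a string and call hot_add_if_degraded("/dev/md0", "/dev/gnbd/mirror", text) once the returning node is exporting again.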
The problem comes when the STANDBY node fails. The md device does not
take well to failure of the imported gnbd device. On failure of the
standby node, the md mount on the active node just hangs.
So at the moment we have had to write a custom script that checks for
failure of the node from which gnbd_services are imported.
On detection of failure, our script has to manually fail the gnbd member
of the md device (mdadm --fail), at which point the md device unfreezes.
Our script then hot-removes the device from the md array, for completeness.
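To make the workaround concrete, here is a rough Python sketch of the logic described above. It is an illustration and not our production script. The liveness test (a single ping) is only an assumption chosen for the example, and the host and device names in the usage note are placeholders. On a failed check it runs "mdadm <md> --fail <device>" followed by "mdadm <md> --remove <device>".

```python
import subprocess


def run_command(argv):
    """Run a command and return its exit status."""
    return subprocess.call(list(argv))


def peer_is_alive(peer_host, run=run_command):
    """Assumed liveness test: one ICMP echo with a 2 second timeout."""
    return run(["ping", "-c", "1", "-W", "2", peer_host]) == 0


def fail_and_remove_commands(md_device, gnbd_device):
    """Build the two mdadm commands: fail the member, then hot-remove it."""
    return [
        ["mdadm", md_device, "--fail", gnbd_device],
        ["mdadm", md_device, "--remove", gnbd_device],
    ]


def handle_peer_failure(md_device, gnbd_device, run=run_command):
    """Fail and remove the gnbd member; stop at the first mdadm error."""
    for argv in fail_and_remove_commands(md_device, gnbd_device):
        if run(argv) != 0:
            return False
    return True


def check_once(peer_host, md_device, gnbd_device, run=run_command):
    """One pass of the watchdog; returns a short status string."""
    if peer_is_alive(peer_host, run):
        return "peer-ok"
    if handle_peer_failure(md_device, gnbd_device, run):
        return "member-failed"
    return "mdadm-error"
```

A cron job or a simple loop could call check_once("standby", "/dev/md0", "/dev/gnbd/mirror") every few seconds. A real script would probably want to ask the cluster manager about membership instead of relying on ping.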
I had thought that gnbd_recvd would hook into CMAN/Magma, so that the
device imported from the failed node would fail automatically.
Regards,
James
--
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster