---
 docs/internals/locking.html.in |  257 ++++++++++++++++++++++++++++++++++++++++
 docs/sitemap.html.in           |    4 +
 2 files changed, 261 insertions(+), 0 deletions(-)
 create mode 100644 docs/internals/locking.html.in

diff --git a/docs/internals/locking.html.in b/docs/internals/locking.html.in
new file mode 100644
index 0000000..3790ef0
--- /dev/null
+++ b/docs/internals/locking.html.in
@@ -0,0 +1,257 @@
+<html>
+  <body>
+    <h1>Resource Lock Manager</h1>
+
+    <ul id="toc"></ul>
+
+    <p>
+      This page describes the design of the resource lock manager
+      that is used for locking disk images, to ensure exclusive
+      access to content.
+    </p>
+
+    <h2><a name="goals">Goals</a></h2>
+
+    <p>
+      The high level goal is to prevent the same disk image being
+      used by more than one QEMU instance at a time (unless the
+      disk is marked as sharable, or readonly). The scenarios
+      to be prevented are thus:
+    </p>
+
+    <ol>
+      <li>
+        Two different running guests configured to point at the
+        same disk image.
+      </li>
+      <li>
+        One guest being started more than once on two different
+        machines due to an admin mistake.
+      </li>
+      <li>
+        One guest being started more than once on a single machine
+        due to a libvirt driver bug.
+      </li>
+    </ol>
+
+    <h2><a name="requirement">Requirements</a></h2>
+
+    <p>
+      The high level goal leads to a set of requirements
+      for the lock manager design:
+    </p>
+
+    <ol>
+      <li>
+        A lock must be held on a disk whenever a QEMU process
+        has the disk open.
+      </li>
+      <li>
+        The lock scheme must allow QEMU to be configured with
+        readonly, shared write, or exclusive writable disks.
+      </li>
+      <li>
+        A lock handover must be performed during the migration
+        process where 2 QEMU processes will have the same disk
+        open concurrently.
+      </li>
+      <li>
+        The lock manager must be able to identify and kill the
+        process accessing the resource if the lock is revoked.
+      </li>
+      <li>
+        Locks can be acquired for arbitrary VM related resources,
+        as determined by the management application.
+      </li>
+    </ol>
+
+    <h2><a name="design">Design</a></h2>
+
+    <p>
+      Within a lock manager the following series of operations
+      will need to be supported:
+    </p>
+
+    <ul>
+      <li>
+        <strong>Register object</strong>
+        Register the identity of an object against which
+        locks will be acquired.
+      </li>
+      <li>
+        <strong>Add resource</strong>
+        Associate a resource with an object for future
+        lock acquisition / release.
+      </li>
+      <li>
+        <strong>Acquire locks</strong>
+        Acquire the locks for all resources associated
+        with the object.
+      </li>
+      <li>
+        <strong>Release locks</strong>
+        Release the locks for all resources associated
+        with the object.
+      </li>
+      <li>
+        <strong>Inquire locks</strong>
+        Get a representation of the state of the locks
+        for all resources associated with the object.
+      </li>
+    </ul>
+
+    <h2><a name="impl">Plugin Implementations</a></h2>
+
+    <p>
+      Lock manager implementations are provided as LGPLv2+
+      licensed, dlopen()able library modules. The plugins
+      will be loadable from the following location:
+    </p>
+
+    <pre>
+/usr/{lib,lib64}/libvirt/lock_manager/$NAME.so
+</pre>
+
+    <p>
+      The lock manager plugin must export a single ELF
+      symbol named <code>virLockDriverImpl</code>, which is
+      a static instance of the <code>virLockDriver</code>
+      struct. The struct is defined in the header file:
+    </p>
+
+    <pre>
+      #include &lt;libvirt/plugins/lock_manager.h&gt;
+    </pre>
+
+    <p>
+      All callbacks in the struct must be initialized
+      to non-NULL pointers. The semantics of each
+      callback are defined in the API docs embedded
+      in the previously mentioned header file.
+    </p>
+
+    <h2><a name="qemuIntegrate">QEMU Driver integration</a></h2>
+
+    <p>
+      With the QEMU driver, the lock plugin will be set
+      in the <code>/etc/libvirt/qemu.conf</code> configuration
+      file by specifying the lock manager name.
+    </p>
+
+    <pre>
+      lockManager="sanlock"
+    </pre>
+
+    <p>
+      By default the lock manager will be a 'no op' implementation
+      for backwards compatibility.
+    </p>
+
+    <h2><a name="usagePatterns">Lock usage patterns</a></h2>
+
+    <p>
+      The following pseudocode illustrates the common
+      patterns of operations invoked on the lock
+      manager plugin callbacks.
+    </p>
+
+    <h3><a name="usageLockAcquire">Lock acquisition</a></h3>
+
+    <p>
+      Initial lock acquisition will be performed from the
+      process that is to own the lock. This is typically
+      the QEMU child process, in between the fork+exec
+      pairing. When adding further resources on the fly
+      to an existing object holding locks, this will be
+      done from the libvirtd process.
+    </p>
+
+    <pre>
+      virLockManagerParam params[] = {
+        { .type = VIR_LOCK_MANAGER_PARAM_TYPE_UUID,
+          .key = "uuid",
+        },
+        { .type = VIR_LOCK_MANAGER_PARAM_TYPE_STRING,
+          .key = "name",
+          .value = { .str = dom->def->name },
+        },
+        { .type = VIR_LOCK_MANAGER_PARAM_TYPE_UINT,
+          .key = "id",
+          .value = { .i = dom->def->id },
+        },
+        { .type = VIR_LOCK_MANAGER_PARAM_TYPE_UINT,
+          .key = "pid",
+          .value = { .i = dom->pid },
+        },
+      };
+      mgr = virLockManagerNew(lockPlugin,
+                              VIR_LOCK_MANAGER_TYPE_DOMAIN,
+                              ARRAY_CARDINALITY(params),
+                              params,
+                              0);
+
+      foreach (initial disks)
+          virLockManagerAddResource(mgr,
+                                    VIR_LOCK_MANAGER_RESOURCE_TYPE_DISK,
+                                    $path, 0, NULL, $flags);
+
+      if (virLockManagerAcquire(mgr, NULL, 0) &lt; 0)
+        ...abort...
+    </pre>
+
+    <h3><a name="usageLockAttach">Lock release</a></h3>
+
+    <p>
+      The locks are all implicitly released when the process
+      that acquired them exits. However, a process may
+      voluntarily give up the locks by running:
+    </p>
+
+    <pre>
+      char *state = NULL;
+      virLockManagerParam params[] = {
+        { .type = VIR_LOCK_MANAGER_PARAM_TYPE_UUID,
+          .key = "uuid",
+        },
+        { .type = VIR_LOCK_MANAGER_PARAM_TYPE_STRING,
+          .key = "name",
+          .value = { .str = dom->def->name },
+        },
+        { .type = VIR_LOCK_MANAGER_PARAM_TYPE_UINT,
+          .key = "id",
+          .value = { .i = dom->def->id },
+        },
+        { .type = VIR_LOCK_MANAGER_PARAM_TYPE_UINT,
+          .key = "pid",
+          .value = { .i = dom->pid },
+        },
+      };
+      mgr = virLockManagerNew(lockPlugin,
+                              VIR_LOCK_MANAGER_TYPE_DOMAIN,
+                              ARRAY_CARDINALITY(params),
+                              params,
+                              0);
+
+      foreach (initial disks)
+          virLockManagerAddResource(mgr,
+                                    VIR_LOCK_MANAGER_RESOURCE_TYPE_DISK,
+                                    $path, 0, NULL, $flags);
+
+      virLockManagerRelease(mgr, &amp;state, 0);
+    </pre>
+
+    <p>
+      The returned state string can be passed to the
+      <code>virLockManagerAcquire</code> method to
+      later re-acquire the exact same locks. This
+      state transfer is commonly used when performing
+      live migration of virtual machines. By validating
+      the state, the lock manager can ensure no other
+      VM has re-acquired the same locks on a different
+      host. The state can also be obtained without
+      releasing the locks, by calling the
+      <code>virLockManagerInquire</code> method.
+    </p>
+
+  </body>
+</html>
diff --git a/docs/sitemap.html.in b/docs/sitemap.html.in
index ad8dc7b..db2963e 100644
--- a/docs/sitemap.html.in
+++ b/docs/sitemap.html.in
@@ -284,6 +284,10 @@
                 <a href="internals/command.html">Spawning commands</a>
                 <span>Spawning commands from libvirt driver code</span>
               </li>
+              <li>
+                <a href="internals/locking.html">Lock managers</a>
+                <span>Use lock managers to protect disk content</span>
+              </li>
             </ul>
           </li>
           <li>
-- 
1.7.4.4