On Thu, Oct 06, 2016 at 12:26:50 +0200, Kashyap Chamarthy wrote: > Backround > --------- > > For QEMU block device jobs, the "ready" boolean field (part of QMP > `query-block-jobs`) was introduced in commit ef6dbf1 (available in QEMU > v2.2.0 or above): > > http://git.qemu.org/?p=qemu.git;a=commitdiff;h=ef6dbf1e4 -- > blockjob: Add "ready" field > > "When a block job signals readiness, this is currently reported only > through QMP. If qemu wants to use block jobs for internal tasks, > there needs to be another way to correctly detect when a block job > may be completed. > > For this reason, introduce a bool "ready" which is set when the > block job may be completed." > > > And, libvirt was fixed to use the above field in this commit (available > in libvirt v1.2.18 or above): > > http://libvirt.org/git/?p=libvirt.git;a=commitdiff;h=eae5924 -- qemu: > Update state of block job to READY only if it actually is ready [1] > > > RFC > --- > > Currently libvirt block APIs (& consequently higher-level applications > like Nova which use these APIs) rely on polling for job completion via libvirt is _not_ polling the data. Libvirt relies on the events from qemu which are also exposed as libvirt events. > virDomainGetBlockJobInfo(), which uses QMP `query-block-jobs`, and > waits for QEMU to report "offset" == "len", which translates to libvirt > "cur" == "end". Based on this, libvirt can take an action (whether to > gracefully abort, or pivot to the copy in case of a COPY job). Again false. Internally we honor the ready state and applications like virsh have code in place that waits for the async events and act after receiving them. > Since QEMU reports the "ready": true field (followed by a > BLOCK_JOB_READY QMP event). It would be helpful if libvirt expose this > via an API, so upper layers could instead use that, rather than polling. We expose the state of the copy job in the XML and forward the READY event from qemu to the users. A running copy job exposes itself in the xml as: <disk type='file' device='cdrom'> <driver name='qemu' type='raw'/> <source file='/var/lib/libvirt/images/systemrescuecd-x86-4.8.0.iso'/> <backingStore/> <mirror type='file' file='/tmp/ble.img' format='raw' job='copy'> <format type='raw'/> <source file='/tmp/ble.img'/> </mirror> [...] </disk> While the ready copy job is exposed as: <disk type='file' device='cdrom'> <driver name='qemu' type='raw'/> <source file='/var/lib/libvirt/images/systemrescuecd-x86-4.8.0.iso'/> <backingStore/> <mirror type='file' file='/tmp/ble.img' format='raw' job='copy' ready='yes'> <format type='raw'/> <source file='/tmp/ble.img'/> </mirror> [...] </disk> Additionally we have anyncrhronous events that are emitted once qemu notifies us that the block job has reached sync state or finished. Libvirt uses the event to switch to the ready state. The documentation suggests that block jobs should listen to the events and act accordingly only after receiving the event. > Problem scenario > ---------------- > > When virDomainBlockRebase() is invoked to start a copy job, then > aborting the said copy operation with virDomainBlockJobAbort() + flag > VIR_DOMAIN_BLOCK_JOB_ABORT_PIVOT can result in a potential race > condition (due to the way the virDomainGetBlockJobInfo() reports the job > status) where the pivot operation fails. Again, this should trigger the block job event and that should be used as the marker to perform the pivot operation. $ virsh event --all --loop event 'block-job-2' for domain hp: Block Copy for hda ready <- after the sync phase is reached event 'block-job-2' for domain hp: Block Copy for hda completed <- after successfully pivoting Applications should act only after receiving such event. > Race condition window > ~~~~~~~~~~~~~~~~~~~~~ > > libvirt finds cur==end AND sends a pivot request, all in the window > before QEMU would have sent "ready": true field [emitted as part of the This is not true. Libvirt checks that the mirror is actually ready. It's done by the commit you've mentioned above. > QMP `query-block-jobs` command's response, indicating that the job has > actually completed], however the pivot request fails because it requires > "ready": true. The problem is that you are polling the block job info which correctly reports that all data was properly copied and you are inferring the block job state from that data. I'm against deliberately reporting false data in the block info structure. The application should register handlers for the block job events and act only if it receives such event. Additionally you may want to check that the state is correct in the XML. The current block job information structure can't be extended unfortunately. Please look at the virsh handling of the block jobs for inspiration since we have example code doing the right thing here. Peter
Attachment:
signature.asc
Description: Digital signature
-- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list