From: Dexuan Cui <decui@xxxxxxxxxxxxx> Sent: Saturday, January 25, 2020 9:50 PM
>
> The state machine in the hv_utils driver can run out of order in some
> corner cases, e.g. if the kvp daemon doesn't call write() fast enough
> due to some reason, kvp_timeout_func() can run first and move the state
> to HVUTIL_READY; next, when kvp_on_msg() is called it returns -EINVAL
> since kvp_transaction.state is smaller than HVUTIL_USERSPACE_REQ; later,
> the daemon's write() gets an error -EINVAL, and the daemon will exit().
>
> We can reproduce the issue by sending a SIGSTOP signal to the daemon, wait
> for 1 minute, and send a SIGCONT signal to the daemon: the daemon will
> exit() quickly.
>
> We can fix the issue by forcing a reset of the device (which means the
> daemon can close() and open() the device again) and doing extra necessary
> clean-up.
>
> Signed-off-by: Dexuan Cui <decui@xxxxxxxxxxxxx>
>
> ---
> Changes in v2:
>     This is actually a new patch that makes the daemons more robust.
>
> Changes in v3 (I addressed Michael's comments):
>     Don't reset target_fd, since that's unnecessary.
>     Reset target_fname by: target_fname[0] = '\0';
>     Added the missing "fs_frozen = true;" in vss_operate().
>     Just after reopen_vss_fd: if vss_operate(VSS_OP_THAW) can not clear
>     fs_frozen due to an error, we just exit().
>     Added comments.
>
> Changes in v4 (Thanks to Michael!):
>     Added the omitted "int fcopy_fd = -1" and
>     "
>         if (fcopy_fd != -1)
>             close(fcopy_fd);
>     "
>
>  tools/hv/hv_fcopy_daemon.c | 37 ++++++++++++++++++++++++----
>  tools/hv/hv_kvp_daemon.c   | 36 ++++++++++++++++------------
>  tools/hv/hv_vss_daemon.c   | 49 +++++++++++++++++++++++++++++---------
>  3 files changed, 91 insertions(+), 31 deletions(-)
>

Reviewed-by: Michael Kelley <mikelley@xxxxxxxxxxxxx>
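
For readers unfamiliar with the daemons, the recovery pattern the commit
message describes can be sketched roughly as follows: when write() to the
device fails, the daemon closes the fd and opens the device again instead of
calling exit(). This is only an illustration, not the code from the patch. The
helper names reopen_dev() and send_reply(), their return values, and the
choice to reset on any failed or short write are assumptions made for the
example.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the patch): close the device fd if it is
 * open, then open the same path again. Returns the new fd, or -1 on failure.
 */
int reopen_dev(int *fd, const char *path)
{
	assert(fd != NULL && path != NULL);

	if (*fd != -1)
		close(*fd);

	*fd = open(path, O_RDWR);
	return *fd;
}

/*
 * Hypothetical helper: write a reply to the device. On a failed or short
 * write, reset by reopening the device rather than exiting.
 *
 * Returns 0 if the reply was written, 1 if the device was reopened,
 * and -1 if the reopen itself failed.
 */
int send_reply(int *fd, const char *path, const void *buf, size_t len)
{
	ssize_t n = write(*fd, buf, len);

	if (n == (ssize_t)len)
		return 0;

	/* Assumption: any failure here is treated as "reset and start over". */
	return reopen_dev(fd, path) < 0 ? -1 : 1;
}
```

A caller that gets 1 back would also have to redo whatever per-session setup
the device expects after open() and clear its own per-transaction state, which
is the "extra necessary clean-up" the commit message mentions; that part is
daemon-specific and is left out of the sketch.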