On Fri, Jan 22, 2021 at 14:09:52 +0300, Nikolay Shirokovskiy wrote: > On Fri, Jan 22, 2021 at 12:45 PM Michal Privoznik <mprivozn@xxxxxxxxxx> > wrote: > > > If libvirtd is sent SIGTERM while it is still initializing, it > > may crash. The following scenario was observed (using 'stress' to > > slow down CPU so much that the window where the problem exists is > > bigger): > > > > 1) The main thread is already executing virNetDaemonRun() and is > > in virEventRunDefaultImpl(). > > 2) The thread that's supposed to run daemonRunStateInit() is > > spawned already, but daemonRunStateInit() is in its very early > > stage (in the stack trace I see it's executing > > virIdentityGetSystem()). > > > > If SIGTERM (or any other signal that we don't override handler > > for) arrives at this point, the main thread jumps out from > > virEventRunDefaultImpl() and enters virStateShutdownPrepare() > > (via shutdownPrepareCb which was set earlier). This iterates > > through stateShutdownPrepare() callbacks of state drivers and > > reaching qemuStateShutdownPrepare() eventually only to > > dereference qemu_driver. But since thread 2) has not been > > scheduled/not proceeded yet, qemu_driver was not allocated yet. > > > > Solution is simple - just check if qemu_driver is not NULL. But > > doing so only in qemuStateShutdownPrepare() would push the > > problem down to virStateShutdownWait(), well > > qemuStateShutdownWait(). Therefore, duplicate the trick there > > too. > > > > I guess this is a partial solution. Initialization may be in a state when > qemu_driver is initialized but qemu_driver->workerPool is still NULL > for example. > > Maybe we'd better delay shutdown until initialization is finished? The fix may not be ideal or complete, but it definitely fixes one non-race issue (see https://www.redhat.com/archives/libvir-list/2021-January/msg01134.html) Even if we come up with a flag indicating whether a driver was fully initialized or not, we still need to check whether it was initialized at all. Unless we want to introduce yet another static variable in the driver, but I think qemu_driver is already enough. So I guess if you reword the commit message to say this is fixing a case when driver initialization fails (rather than a race), you get my Reviewed-by: Jiri Denemark <jdenemar@xxxxxxxxxx>