>> "remaining vmcore is zeroed that it is bad and not acceptable for kdump."
>>
>> Which scenario are you concerned about? User space plays stupid games
>> (unbinding a driver from a virtio-mem device in a *kdump kernel* after
>> opening /proc/vmcore) and wins stupid prizes (a warning and a vmcore
>> filled (partially) with zeroes). Why isn't a warning sufficient for
>> something like that?
>
> Hi David,
>
> Suppose we have the use case below:
>

Hi Dave,

thanks for elaborating, it helps a lot to understand your concerns.

> A user plays with the game (Probably in hypervisor part, but the user is
> not aware that the guest panicked and in a kdump kernel), then we get a
> zeroed vmcore. But the panic can not be easily reproduced any more,
> then the warning is not useful.

I can only speak about virtio-mem (well, that's the only currently known
"dynamic vmcore_cb registration" user :) ).

virtio-mem devices cannot get hotunplugged in the hypervisor (i.e.,
QEMU) -- you can only hot(un)plug device memory, but not the device
itself; it will stick around. Hotunplugging the device is completely
blocked and not supported.

The reason is simple: unplugging a virtio-mem device will also remove
the device memory. It's similar to other memory devices, such as DIMMs
-- I would not recommend forced, physical removal of a DIMM to anybody
-- not while the OS is running and not while kdump is saving
/proc/vmcore. Which is also the reason why hypervisors don't generally
support forced removal of such devices. :)

So for the currently known vmcore_cb users, hypervisor action cannot
result in driver unbinding and consequently vmcore_cb changes.

Note: virtio-mem-pci devices might eventually get hotplugged while kdump
is active. I assume we don't disable PCI hotplug in kdump kernels. While
this will trigger a warning ("Unexpected vmcore callback registration"),
the vmcore will not be affected and will be complete.

>
> But if you think user is playing the game in kdump kernel, eg. in guest
> os while kdump is saving vmcore then it is nearly not possible to happen
> I agree with you it is a very trivial problem.

Yes, that's the only thing I consider can happen. For example, doing a:

  # echo 1 > /sys/devices/pci0000\:00/0000\:00\:03.0/remove

in a kdump kernel after opening the vmcore.

>
> Probably we have some misunderstanding, but it would be good to make it
> clear :)

Understanding your concern, it could be future proof (for future
vmcore_cb users?) to fail the ioctls instead of returning 0. But even
for new memory devices, unplug is usually something to be fenced off by
the hypervisor, just like not allowing forced DIMM removal.

The only thing I could imagine is having, e.g., a virtio-balloon device
register a vmcore_cb dynamically and providing a new mechanism to query
if a page is backed by a real page in the hypervisor (similar to Xen's
hypercall). Such a device could be unplugged without harm, as it doesn't
actually provide device memory.

--
Thanks,

David / dhildenb