On Thu, Mar 20, 2025 at 02:40:10AM +0000, Pasha Tatashin wrote:
> Introduce a new subsystem within the driver core to enable keeping
> devices alive during kernel live update. This infrastructure is
> designed to be registered with and driven by a separate Live Update
> Orchestrator, allowing the LUO's state machine to manage the save and
> restore process of device state during a kernel transition.
>
> The goal is to allow drivers and buses to participate in a coordinated
> save and restore process orchestrated by a live update mechanism. By
> saving device state before the kernel switch and restoring it
> immediately after, the device can appear to remain continuously
> operational from the perspective of the system and userspace.
>
> components introduced:
>
> - `struct dev_liveupdate`: Embedded in `struct device` to track the
>   device's participation and state during a live update, including
>   request status, preservation status, and dependency depth.
>
> - `liveupdate()` callback: Added to `struct bus_type` and
>   `struct device_driver`. This callback receives an enum
>   `liveupdate_event` to manage device state at different stages of the
>   live update process:
>   - LIVEUPDATE_PREPARE: Save device state before the kernel switch.
>   - LIVEUPDATE_REBOOT: Final actions just before the kernel jump.
>   - LIVEUPDATE_FINISH: Clean-up after live update.
>   - LIVEUPDATE_CANCEL: Clean up any saved state if the update is
>     aborted.
>
> - Sysfs attribute "liveupdate/requested": Added under each device
>   directory, allowing user to request that a specific device to
>   participate in live update. I.e. its state is to be preserved
>   during the update.

As you can imagine, I have "thoughts" about all of this being added to
the driver core. But, before I go off on that, I want to see some real,
actual, working, patches for at least 3 bus subsystems that correctly
implement this before I even consider reviewing this.

Show us real users please, otherwise any attempt at reviewing this is
going to just be a waste of our time as I have doubts that this actually
even works :)

Also, as you are adding a new user/kernel api, please also point at the
userspace tools that are written to handle all of this. As you are
going to be handling potentially tens of thousands of devices from
userspace this way, in a single system, real code is needed to even
consider that this is an acceptable solution.

thanks,

greg k-h