On Mon, 2005-08-29 at 22:24 -0700, Zachary Amsden wrote:
> Silbermann, Martine wrote:
>
> >The latest draft of the Use Case on Hotplug for Virtualization is posted
> >at:
> >http://www.developer.osdl.org/maryedie/HOTPLUG/docs/Hotplug_virtual_use_case.txt
> >
> >It can also be accessed from the hotplug use cases page:
> >http://developer.osdl.org/dev/usecases/hotplug.shtml
> >
> >Please remember to share your comments/errata/suggestions.
> >
> >Thanks - Martine
> >
> >-------------------------
> >1.Serviceability (hotplug at physical layer)
> >-------------------------
> >In this sub-case the System Administrator needs the ability to
> >remove/replace failing components.
> >Unfortunately, CPU failures tend to be fatal and usually don't give
> >any warning. Fortunately, they're also very infrequent. Because they
> >are usually fatal, it's likely that you won't be looking at a
> >hot-remove scenario.(Though, if you have a processor failure and
> >remove it while the system is down, the System Administrator needs
> >the option to reboot immediately and hot-add the replacement). In
> >contrast, memory and I/O often give adequate warning, via single-bit
> >or parity errors, that they're failing; thus providing an opportunity
> >to have them replaced before they cause a system failure.
> >
>
> I'm not sure I totally agree with this. I like the capacity, migration,
> and virtualization motivations a lot. First, CPU failures are not
> always fatal; in a NUMA, blade or block systems, CPU failures could
> potentially be tolerated for single nodes without disrupting the entire

Good point, I'll correct that.

> system. Second, I'm not 100% convinced that memory failures are any
> more predictable based on parity errors than CPU failures - it could
> simply be solar radiation. I'm also less convinced that memory is more
> easily replaced in a hotplug fashion than CPUs - you have similar
> migration issues, plus a lot of nasty global TLB issues (not to mention
> electrical problems!). I confess to be not thoroughly studied on very

Yes the detection of a potential problem for memory may be easier but I
didn't mean to imply that doing the hotplug of memory is. I'll make sure
to reword that.

> recent systems in these respects. Nevertheless, your point is valid; I
> simply think it could use more convincing arguments.
>
> That said, I think everyone on these lists probably agrees with the
> usage cases. :)
>

Thanks so much for your input, I'll post the changed version soon.

> Zach