Hi everyone,

Here's what I've started drafting, for your input, ideas, and
contributions, toward getting a regression test suite going.

Automated Regression Test for Hotplug CPU

What are the test cases we need? How do we verify success in each test
case?

Test Case 1: Verify that interrupts are moved off a CPU when it is
offlined through sysfs on a multiprocessor system.

1. Set the IRQ smp_affinity mask for the disk controller to the CPU.

   Echo the appropriate hex mask into /proc/irq/IRQ#/smp_affinity

   Test interrupts from devices other than disk controllers?

2. Start watching the interrupt counts in /proc/interrupts.

   Is it worth verifying tools such as sar at the same time? Other
   statistics to monitor?

3. Start writing to a disk.

   Suggestions for what to do in order to be able to verify all writes
   are completed and correct?

4. Offline a CPU, say cpu1.

   echo 0 > /sys/devices/system/cpu/cpu1/online

   cpu0 is not hotswappable on some architectures.

   Can we pinpoint when a CPU goes offline? Or when can we know it is
   safe to physically remove a CPU? It's my understanding that
   timeslice overrun prevents
   'time echo 0 > /sys/devices/system/cpu/cpu1/online' from being an
   accurate measure of how long it takes to offline a CPU. Does the
   return of an echo signify the CPU has completed offlining?

5. Do we see any change in /proc/interrupts? Any interesting kernel
   messages in /var/log/messages?

Test Case 2: Verify that running processes are moved off a CPU when it
is offlined through sysfs on a multiprocessor system.

...
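
To make the discussion concrete, here is a rough shell sketch of how
steps 1, 2, 4, and 5 of Test Case 1 might be scripted. It is a starting
point, not a finished test. The function names (cpu_mask,
set_irq_affinity, irq_count, offline_cpu) are made up for this draft,
and IRQ 14 in the example at the bottom is only a placeholder for
whatever IRQ the disk controller actually uses. The helpers take file
and directory paths as arguments so they can be tried on copies of the
/proc and /sys files before being pointed at a live system as root.
Reading the online file back after the echo does not settle when the
offline has actually completed; that question is still open.

```shell
#!/bin/sh
# Rough sketch for Test Case 1. Helper names are hypothetical.

# Print the hex smp_affinity mask that selects a single CPU (cpu1 -> 2).
# Only meant for small CPU numbers; larger systems need a longer mask.
cpu_mask() {
    printf '%x\n' $((1 << $1))
}

# Step 1: bind an IRQ to one CPU.
# $1 = IRQ directory (for example /proc/irq/14), $2 = CPU number
set_irq_affinity() {
    cpu_mask "$2" > "$1/smp_affinity"
}

# Steps 2 and 5: print the interrupt count for one IRQ on one CPU.
# $1 = file in /proc/interrupts format, $2 = IRQ number, $3 = CPU number
# Prints "absent" if the header has no column for that CPU.
irq_count() {
    awk -v irq="$2:" -v cpu="CPU$3" '
        NR == 1 { for (i = 1; i <= NF; i++) if ($i == cpu) col = i + 1; next }
        $1 == irq { if (col) print $col; else print "absent"; exit }
    ' "$1"
}

# Step 4: request the offline and print what the online file reads back.
# $1 = CPU sysfs directory (for example /sys/devices/system/cpu/cpu1)
offline_cpu() {
    echo 0 > "$1/online" || return 1
    cat "$1/online"
}

# Example run on a live system, as root (IRQ 14 is a placeholder):
#   set_irq_affinity /proc/irq/14 1
#   before=$(irq_count /proc/interrupts 14 1)
#   offline_cpu /sys/devices/system/cpu/cpu1
#   after=$(irq_count /proc/interrupts 14 1)
#   echo "IRQ 14 on cpu1: before=$before after=$after"
```

Comparing before and after for the offlined CPU, and the same counts
for the remaining CPUs, would be one possible pass/fail check for step
5. Whether the offlined CPU's column disappears or simply stops
counting is something the test should record rather than assume.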