[Hotplug_sig] Automated Hotplug CPU Regression Test Cases

markw at osdl.org (Mark Wong) · Thu Jan 6 16:14:52 2005

Hi everyone,

Here's a first draft, posted for input, ideas, and contributions, toward
getting a regression test suite going.

Automated Regression Test for Hotplug CPU 

What are the test cases we need?

How do we verify success in each test case?

Test Case 1:
Verify interrupts are moved off of a CPU when it is offlined through sysfs
on a multiprocessor system?

1. Set the IRQ smp_affinity mask for the disk controller to the CPU that
   will be offlined.
   Echo the appropriate hex mask into /proc/irq/IRQ#/smp_affinity

   Test interrupts from devices other than disk controllers?

2. Start watching the interrupt counts in /proc/interrupts.

   Is it worth verifying tools such as sar at the same time?

   Other statistics to monitor?

3. Start writing to a disk.

   Suggestions for what to do in order to be able to verify all writes
   are completed and correct?

4. Offline a CPU, say cpu1.

   echo 0 > /sys/devices/system/cpu/cpu1/online

   cpu0 is not hotswappable on some architectures.

   Can we pinpoint when a CPU goes offline?  Or when can we know it is
   safe to physically remove a CPU?

   It's my understanding that timeslice overrun prevents
   'time echo 0 > /sys/devices/system/cpu/cpu1/online' from being an
   accurate measure of how long it takes to offline a CPU.

   Does the return of an echo signify the CPU has completed offlining?

5. Do we see any change in /proc/interrupts?
   Any interesting kernel messages in /var/log/messages?
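
A rough sketch of how steps 1, 2, and 4 might be scripted, for
discussion only. The helper names (cpu_mask, set_irq_affinity,
irq_count, offline_cpu) and the SYSFS_CPU / PROC_IRQ / PROC_INTERRUPTS
overrides are made up for this sketch and are not part of any existing
tool; the overrides exist only so the helpers can be pointed at scratch
files. It assumes CPU numbers below 32, so the mask fits in a single hex
word. Since it is still an open question whether the echo returning
means the CPU has finished offlining, offline_cpu polls the 'online'
file afterwards; treating a 0 there as "done" is itself an assumption.

```shell
#!/bin/sh
# Sketch only: helper names and the path overrides below are hypothetical.
SYSFS_CPU=${SYSFS_CPU:-/sys/devices/system/cpu}
PROC_IRQ=${PROC_IRQ:-/proc/irq}
PROC_INTERRUPTS=${PROC_INTERRUPTS:-/proc/interrupts}

# Print the hex smp_affinity mask selecting only CPU number $1.
# Assumes $1 < 32 so no comma-separated mask groups are needed.
cpu_mask() {
    printf '%x\n' $((1 << $1))
}

# Step 1: bind IRQ $1 to CPU number $2.
set_irq_affinity() {
    cpu_mask "$2" > "$PROC_IRQ/$1/smp_affinity"
}

# Step 2: print the interrupt count for IRQ $1 on CPU number $2.
# The column is located via the header line rather than by position;
# prints nothing if there is no column for that CPU.
irq_count() {
    awk -v irq="$1:" -v cpu="CPU$2" '
        NR == 1 { for (i = 1; i <= NF; i++) if ($i == cpu) col = i + 1; next }
        $1 == irq && col { print $col }
    ' "$PROC_INTERRUPTS"
}

# Step 4: offline CPU number $1, then poll its "online" file.
# Returns 0 once the file reads 0, or 1 after $2 polls (default 10).
offline_cpu() {
    f="$SYSFS_CPU/cpu$1/online"
    echo 0 > "$f" || return 1
    tries=${2:-10}
    while [ "$tries" -gt 0 ]; do
        [ "$(cat "$f")" = "0" ] && return 0
        sleep 1
        tries=$((tries - 1))
    done
    return 1
}
```

   Example use as root, with IRQ 14 and cpu1 as placeholder values:

   set_irq_affinity 14 1
   before=$(irq_count 14 1)
   offline_cpu 1 && echo "cpu1 offline; IRQ 14 count was $before"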

Test Case 2:
Verify running processes are moved off of a CPU when it is offlined
through sysfs on a multiprocessor system?

...