[Hotplug_sig] Test case documentation

I've revised the pseudocode to match what was actually coded and to
make it a bit more readable.  I haven't done case 5 since it hasn't
been finished yet.

These will be included in the LHCS Regression Test Suite package.

Attached below...

Bryce



Testcase 01
-----------

This test attempts to verify that offlining a CPU does not cause
problems for a process that is writing to disk.  We create a process
that writes to disk, force it to run only on a specified CPU by setting
its CPU affinity to just that CPU, then offline that CPU and verify that
the process moves to another processor properly.
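
As a rough illustration of the affinity step, here is a minimal sketch
in Python.  The helper name pin_to_cpu is made up for this example and
is not part of the test suite; it relies on os.sched_setaffinity, which
Python provides on Linux.

```python
import os


def pin_to_cpu(pid: int, cpu: int) -> set:
    """Restrict `pid` (0 means the calling process) to a single CPU.

    Hypothetical helper for illustration only.  Returns the affinity
    set that the kernel reports after the change.
    """
    os.sched_setaffinity(pid, {cpu})
    return os.sched_getaffinity(pid)
```

Once the pinned CPU is offlined, the kernel has to move the process
somewhere else, which is the behaviour the test is trying to observe.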


Notes
=====

There are two kinds of masks:  one to specify which CPUs are allowed
to be used for the given process, and one for the smp affinity of
interrupts.

This may be hard to verify but we can indirectly check on this
by looking at /proc/stat or measuring the relative performance
of some parallelized benchmark before and after onlining the CPU.
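
One way the /proc/stat check could look is sketched below in Python.
The function name is hypothetical, and the sketch assumes that the
per-CPU lines follow the "cpuN <counters...>" layout described in
proc(5) and that an offline CPU simply has no line of its own.

```python
def cpus_in_proc_stat(stat_text: str) -> dict:
    """Map CPU number -> sum of its time counters in /proc/stat text.

    Hypothetical helper: comparing two results taken before and after
    onlining a CPU shows whether that CPU appeared and accumulated time.
    """
    totals = {}
    for line in stat_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        name = fields[0]
        # Skip the aggregate "cpu" line and non-CPU lines such as "intr".
        if not (name.startswith("cpu") and name[3:].isdigit()):
            continue
        totals[int(name[3:])] = sum(int(value) for value in fields[1:])
    return totals
```

Feeding it the contents of /proc/stat twice and comparing the totals
for the CPU under test gives an indirect sign that the CPU is doing
work.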


Algorithm
=========
Given a CPU to test that exists

Take a snapshot of what CPUs are on and off initially

Make sure the cpu is online

Start up a process that writes to disk

Loop until done:
  Take a snapshot of /proc/interrupts

  Foreach CPU in the system
    online the CPU
    migrate the IRQs to it
    sleep a little while

  Foreach CPU in the system
    migrate IRQs onto the CPU
    offline the cpu
    sleep a little while

  Take another snapshot of /proc/interrupts

  Print a report showing the change in IRQs


When exiting:
  Kill the write loop process

  Restore all CPUs to their initial state
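
The "change in IRQs" report could be computed along these lines.  This
is only a sketch in Python with made-up function names, not the code in
the package.  It assumes the usual /proc/interrupts layout: a header
row naming the CPUs that are currently online, followed by one row per
interrupt.

```python
def parse_interrupts(text: str) -> dict:
    """Parse /proc/interrupts text into {irq_label: {cpu_name: count}}."""
    lines = text.splitlines()
    if not lines:
        return {}
    cpus = lines[0].split()  # e.g. ["CPU0", "CPU1"]
    counts = {}
    for line in lines[1:]:
        if ":" not in line:
            continue
        label, rest = line.split(":", 1)
        per_cpu = {}
        # Counts come first; stop at the first non-numeric column.
        for cpu, token in zip(cpus, rest.split()):
            if not token.isdigit():
                break
            per_cpu[cpu] = int(token)
        counts[label.strip()] = per_cpu
    return counts


def interrupt_deltas(before: dict, after: dict) -> dict:
    """Return only the non-zero per-CPU changes between two snapshots."""
    deltas = {}
    for irq, per_cpu in after.items():
        old = before.get(irq, {})
        for cpu, count in per_cpu.items():
            change = count - old.get(cpu, 0)
            if change:
                deltas.setdefault(irq, {})[cpu] = change
    return deltas
```

Keying the counts by CPU name rather than by column position matters
here, since the set of columns can differ between two snapshots when
CPUs have gone on or off in between.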


Testcase 02
-----------

This test checks that a process migrates when the CPU it is running on
is offlined.  


Algorithm
=========
Given a CPU to test that exists

Make sure the cpu is online

Start a process that just uses processor cycles

Loop until done:
  Move the process to the CPU we will be offlining

  Offline the CPU

  Determine which CPU the process migrated to

  Verify that it is still running

  Verify that it is not running on the original CPU

  Turn the CPU back online



When exiting:
  Kill the spin loop process
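
For the "which CPU did it migrate to" step, one option is to read the
processor field from /proc/<pid>/stat, which proc(5) lists as field 39.
The Python sketch below uses hypothetical function names and is just
one possible way to do this check.

```python
def cpu_from_stat_line(stat_line: str) -> int:
    """Extract the 'processor' field (field 39) from a /proc/<pid>/stat line."""
    # Field 2 (comm) is wrapped in parentheses and may itself contain
    # spaces or parentheses, so split after the LAST ")".
    rest = stat_line[stat_line.rindex(")") + 1:].split()
    # rest[0] is field 3 (state), so field 39 is rest[36].
    return int(rest[36])


def current_cpu(pid: int) -> int:
    """Return the CPU that `pid` last ran on, according to /proc."""
    with open(f"/proc/{pid}/stat") as stat_file:
        return cpu_from_stat_line(stat_file.read())
```

After offlining the CPU, the test passes if current_cpu(pid) returns
something other than the offlined CPU and the process still exists.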



Testcase 03
-----------

This test verifies that when a new CPU is onlined, the scheduler takes
advantage of it by shifting some of its workload onto it.  We do this
by offlining a CPU, creating a number of processor-intensive processes,
then onlining the CPU and checking that at least one of the processes
has moved to that CPU.


Algorithm
=========
Given a CPU to test that exists

Take a snapshot of what CPUs are on and off initially

Loop until done:
  Online all of the CPUs and note their state

  Offline the specified CPU

  Start up a number of processes equal to twice the number of CPUs we
  have, so we can be pretty sure that we've got enough processes that at
  least one will migrate to the new CPU.
  
  Now online the specified CPU

  Wait a few seconds, to allow the process scheduler to move processes
  around a bit.

  Verify that at least one process has migrated to the new CPU by
  looking at the output from 'ps -o psr -o com' and searching for our
  CPU running the process.


When exiting:
  Kill all of the load processes

  Restore all CPUs to their initial state
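
The ps check can be sketched as a small parser.  This Python example
assumes two-column output with the processor number first and the
command name second (what the psr and comm format specifiers of ps
produce).  The function name and the load process name used in the
usage note are placeholders.

```python
def processes_on_cpu(ps_output: str, cpu: int, command: str) -> int:
    """Count rows of 'PSR COMMAND' style output matching cpu and command."""
    count = 0
    for line in ps_output.splitlines():
        parts = line.split(None, 1)
        # Skip the header row and anything that is not "<number> <name>".
        if len(parts) != 2 or not parts[0].isdigit():
            continue
        if int(parts[0]) == cpu and parts[1].strip() == command:
            count += 1
    return count
```

The test would pass when, for example,
processes_on_cpu(output, cpu, "my_load_process") is at least 1 a few
seconds after the CPU has been onlined.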



Testcase 04
-----------

This test verifies that we can't offline ALL of the CPUs in the system.
We do this by onlining all the CPUs, then offlining all the CPUs and
verifying that an error is returned for the last one.

Algorithm
=========
Loop until done:
  Take a snapshot of what CPUs are on and off initially

  Online all the CPUs

  Offline all the CPUs, verifying that an error is returned for the
  last one

  Restore system to initial state
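
The snapshot, offline, and restore steps map naturally onto the
per-CPU "online" files under /sys/devices/system/cpu.  Below is a
minimal Python sketch with hypothetical function names.  The root
directory is a parameter so the logic can be tried against a scratch
directory, and the sketch assumes that a refused offline request shows
up as a write error and that a CPU which cannot be hotplugged has no
"online" file at all.

```python
import os
import re

SYSFS_CPU_ROOT = "/sys/devices/system/cpu"


def snapshot_cpu_state(root: str = SYSFS_CPU_ROOT) -> dict:
    """Map CPU number -> True/False for every CPU with an 'online' file."""
    state = {}
    for entry in os.listdir(root):
        match = re.fullmatch(r"cpu(\d+)", entry)
        if not match:
            continue  # e.g. cpufreq, cpuidle
        path = os.path.join(root, entry, "online")
        if os.path.exists(path):
            with open(path) as online_file:
                state[int(match.group(1))] = online_file.read().strip() == "1"
    return state


def set_cpu_online(cpu: int, online: bool, root: str = SYSFS_CPU_ROOT) -> bool:
    """Try to online or offline a CPU; return False if the write fails."""
    path = os.path.join(root, f"cpu{cpu}", "online")
    try:
        with open(path, "w") as online_file:
            online_file.write("1" if online else "0")
        return True
    except OSError:
        return False


def restore_cpu_state(state: dict, root: str = SYSFS_CPU_ROOT) -> None:
    """Put CPUs back as recorded, onlining first so one CPU always stays up."""
    for cpu, online in sorted(state.items(), key=lambda item: not item[1]):
        set_cpu_online(cpu, online, root)
```

With helpers like these, the test amounts to calling
set_cpu_online(cpu, False) for every CPU in the snapshot and checking
that at least the final call returns False.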




Testcase 06
-----------

It's been found that onlining and offlining CPUs sometimes confuses
some of the various system tools.  In particular, we found it caused
top to crash, and that sar wouldn't register newly available CPUs that
weren't there when it started.  This test case seeks to exercise these
known error cases and verify that they behave correctly now.


Algorithm - Top
===============
Given a CPU to test that exists

Make sure the specified cpu is online

Loop until done:
  Start up top and give it a little time to run

  Offline the specified CPU

  Wait a little time for top to notice the CPU is gone

  Now check that top hasn't crashed by verifying its PID is still 
  being reported by ps.

When exiting:
  Kill the top process
  Restore all CPUs to their initial state
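
The "has top crashed" check above looks for the PID in ps output.  An
alternative that avoids parsing ps is to probe the PID with signal 0,
sketched here in Python.  Note this is a different technique from the
one described above, and the function name is made up.

```python
import os


def pid_is_alive(pid: int) -> bool:
    """Return True if a process with this PID currently exists."""
    try:
        os.kill(pid, 0)  # signal 0: existence check only, nothing is sent
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True
```

One caveat: a process that has exited but has not yet been reaped by
its parent still counts as existing for this check.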


Algorithm - Sar
===============
Given a CPU to test that exists

Make sure the specified cpu is offline

Loop until done:
  Start up sar writing to a temp log and give it a little time to run

  Verify that sar has correctly listed the missing CPU as 'nan' in its
  temp log

  Take a timestamp and count how many CPUs sar is reporting to be
  offline

  Online the specified cpu

  Take another timestamp and another count of offlined CPUs.

  Verify that the number of CPUs offline has changed

When exiting:
  Kill the sar process


