[Hotplug_sig] Automated Hotplug CPU Regression Test Cases, Take 2

On Thu, 2005-03-10 at 09:26, Mark Wong wrote:
> Here's another version, fleshed out a little more from Nathan's
> comments.
> 
> 
> Test Case 1:
> What happens to disk controller interrupts when you offline a CPU on
> a multiprocessor system?
> 
> 1. Note the current smp_affinity mask for the disk controller to stress.
> 
>    Set the IRQ smp_affinity mask for the disk controller to all CPUs.
> 
>    Echo the appropriate hex mask into /proc/irq/IRQ#/smp_affinity
> 
>    Verify the smp_affinity mask.
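A rough sketch of step 1, in case it helps to script it.  The helper
names all_cpus_mask and masks_equal are made up for illustration.  The
sketch assumes fewer than 63 CPUs and a mask that reads back as a single
hex word; it may come back zero-padded, so the comparison is numeric.

```shell
# Hypothetical helper: hex mask with the low N bits set, one bit per CPU.
# Only valid for N < 63 because it relies on shell integer arithmetic.
all_cpus_mask() {
    ncpus=$1
    printf '%x\n' $(( (1 << ncpus) - 1 ))
}

# Hypothetical helper: compare two hex masks numerically, so that
# "0000000f" and "f" count as the same mask.
masks_equal() {
    [ "$((0x$1))" -eq "$((0x$2))" ]
}
```

On a 4-CPU box, all_cpus_mask 4 gives the value to echo into
/proc/irq/IRQ#/smp_affinity, and masks_equal can check what is read back
from the same file against it.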
> 
> 2. Start watching the interrupt counts in /proc/interrupts.
> 
>    Is it worth verifying tools such as sar at the same time?
IMHO, yes, but don't depend on it in case it breaks.
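
For the /proc/interrupts part, something like the following could pull
out the per-CPU counts for one IRQ.  This is only a sketch: irq_counts
is a made-up name, and it assumes the usual layout of a header row of
CPU columns followed by one row per IRQ.  The set of columns may change
once a CPU goes offline, which is worth checking on the test system.

```shell
# Hypothetical helper: print the per-CPU counts for one IRQ.
# $1 = IRQ number, $2 = file to read (defaults to /proc/interrupts).
irq_counts() {
    awk -v irq="$1:" '
        NR == 1 { ncpu = NF; next }      # header row: CPU0 CPU1 ...
        $1 == irq {
            out = ""
            for (i = 2; i <= ncpu + 1; i++)
                out = out (i > 2 ? " " : "") $i
            print out
        }' "${2:-/proc/interrupts}"
}
```

Calling irq_counts in a loop with a sleep, and logging a timestamp with
each sample, would give a record to compare before and after the
offline.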
> 
> 3. Start writing to a disk.
> 
>    while true; do echo 1 > dud; sleep 1; done
I think the disk activity needs to be much heavier than this.  Start
with this to get the test going, but leave a way to generate much more
activity, such as parallel reads _and_ writes.  Also, if we can
ultimately run it on a system with more than one interface active, that
would be a good idea.
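
As a starting point for heavier load, here is a minimal sketch that runs
several writers in parallel and reads each file back.  The name
start_io_load and its arguments are invented for this example.  Note
that the read-back may be served from the page cache rather than the
disk, so this is not a true read load as written.

```shell
# Hypothetical helper: run N parallel write-then-read jobs and wait.
# $1 = directory on the disk under test, $2 = number of jobs,
# $3 = size of each file in KB.
start_io_load() {
    dir=$1; n=$2; kb=$3
    i=0
    while [ "$i" -lt "$n" ]; do
        (
            f="$dir/load.$i"
            # Write the file, then read it back.
            dd if=/dev/zero of="$f" bs=1024 count="$kb" 2>/dev/null &&
            dd if="$f" of=/dev/null bs=1024 2>/dev/null
        ) &
        i=$((i + 1))
    done
    wait    # block until every background job has finished
}
```

Wrapping the call in a "while true" loop would keep the load going for
the duration of the offline test.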


> 
>    Suggestions for what to do in order to be able to verify all writes
>    are completed and correct?

Is this a possibility?
Write a known pattern, then compare the resulting file against that
pattern once the I/O finishes.  If you read a file with a known pattern,
you can check it as you read it.
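
A minimal sketch of that idea follows.  The function names and the
pattern text are made up for illustration.  The expected pattern is
regenerated on the fly and compared against the file with cmp, so no
second copy has to be stored on the disk under test.

```shell
# Hypothetical helper: emit N numbered lines of a fixed pattern.
gen_pattern() {
    awk -v n="$1" 'BEGIN {
        for (i = 1; i <= n; i++)
            printf "%08d hotplug-test-pattern\n", i
    }'
}

# Write the pattern to a file and flush it.  $1 = file, $2 = line count.
write_pattern() {
    gen_pattern "$2" > "$1" && sync
}

# Return 0 only if the file matches the pattern byte for byte.
verify_pattern() {
    gen_pattern "$2" | cmp -s - "$1"
}
```

For example, write_pattern dud 100000 before the offline and
verify_pattern dud 100000 afterwards.  The comparison may read from the
page cache, so it confirms the file contents rather than what is
physically on the platter.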
> 
> 4. Offline a CPU, pick on cpu1.
> 
>    echo 0 > /sys/devices/system/cpu/cpu1/online
> 
>    cpu0 is not hotswappable on some architectures and will not have an online
>    attribute.
Do we want to try to offline cpu0 on such an architecture and make sure
the echo does not return "0"?
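
The check for the online attribute could be scripted along these lines.
This is a sketch only: offline_cpu and the SYSFS_CPU variable are
invented names, and the return codes are an arbitrary choice.  It reads
the attribute back after the write rather than trusting the echo alone.

```shell
# Assumed default location; override SYSFS_CPU to point elsewhere.
SYSFS_CPU=${SYSFS_CPU:-/sys/devices/system/cpu}

# Hypothetical helper: try to offline cpuN.
# Returns 0 if the CPU reads back as offline, 2 if it has no online
# attribute, and 1 if the write fails or the CPU stays online.
offline_cpu() {
    attr="$SYSFS_CPU/cpu$1/online"
    if [ ! -e "$attr" ]; then
        echo "cpu$1: no online attribute" >&2
        return 2
    fi
    if ! echo 0 > "$attr"; then
        echo "cpu$1: write to $attr failed" >&2
        return 1
    fi
    [ "$(cat "$attr")" = "0" ]
}
```

Running offline_cpu 0 on an architecture where cpu0 cannot be removed
should then report the missing attribute instead of succeeding.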
> 
>    Can we pinpoint when a CPU goes offline?
> 
>    It's my understanding that timeslice overrun prevents
>    'time echo 0 > /sys/devices/system/cpu/cpu1/online' from being an
>    accurate measure of how long it takes to offline a CPU.
> 
>    A return of 0 (zero) signifies the successful completion of offlining
>    the CPU from the kernel's point of view.
> 
>    Verify the smp_affinity mask of the affected disk controller.
> 
> 5. Analyze data collected from /proc/interrupts?
> 
>    Relevant messages regarding the procedure will appear in
>    /var/log/messages, depending on the architecture being tested.
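For step 5, one possible way to analyze the data is to subtract a
"before" sample from an "after" sample, column by column.  The sketch
below assumes each sample is a space-separated list of per-CPU counts
for the disk controller's IRQ; count_delta is a made-up name.  It
refuses to compare samples with different column counts, since that
suggests the set of CPUs changed between them.

```shell
# Hypothetical helper: per-column difference of two count samples.
# $1 = counts before, $2 = counts after, e.g. "1000 2000".
count_delta() {
    awk -v before="$1" -v after="$2" 'BEGIN {
        n = split(before, b, " ")
        m = split(after, a, " ")
        if (n != m) {
            print "column count changed: " n " -> " m > "/dev/stderr"
            exit 1
        }
        for (i = 1; i <= n; i++)
            printf "%s%d", (i > 1 ? " " : ""), a[i] - b[i]
        printf "\n"
    }'
}
```

A delta of zero in a column over the test interval means that CPU
handled no interrupts for the disk controller during that time.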


> 
> ______________________________________________________________________
> _______________________________________________
> Hotplug_sig mailing list
> Hotplug_sig@xxxxxxxxxxxxxx
> http://lists.osdl.org/mailman/listinfo/hotplug_sig
-- 
Mary Edie Meredith 
maryedie@xxxxxxxx
503-906-1942
Open Source Development Labs

