On Thu, 2005-03-10 at 09:26, Mark Wong wrote:
> Here's another version, fleshed out a little more from Nathan's
> comments.
>
>
> Test Case 1:
> What happens to disk controller interrupts when you offline a CPU on
> a multiprocessor system?
>
> 1. Note the current smp_affinity mask for the disk controller to stress.
>
>    Set the IRQ smp_affinity mask for the disk controller to all CPUs.
>
>    Echo the appropriate hex mask into /proc/irq/IRQ#/smp_affinity
>
>    Verify the smp_affinity mask.
>
> 2. Start watching the interrupt counts in /proc/interrupts.
>
>    Is it worth verifying tools such as sar at the same time?

IMHO, yes, but don't depend on it in case it breaks.

>
> 3. Start writing to a disk.
>
>    while true; do echo 1 > dud; sleep 1; done

I think the disk activity needs to be much heavier than this.
Start with this to get the test going, but leave a way to start
much more activity, like parallel reads _and_ writes going on.
Also, if ultimately we can run it on a system with more than one
interface active, that would be a good idea.

>
>    Suggestions for what to do in order to be able to verify all writes
>    are completed and correct?

Is this a possibility: write a known pattern, then, when the I/O
finishes, compare the resulting file against that pattern for
differences. If you read a file with a known pattern, you can
check it as you read it.

>
> 4. Offline a CPU, pick on cpu1.
>
>    echo 0 > /sys/devices/system/cpu/cpu1/online
>
>    cpu0 is not hotswappable on some architectures and will not have an online
>    attribute.

Do we want to try to offline cpu0 on such an architecture and make
sure the echo does not return "0"?

>
>    Can we pinpoint when a CPU goes offline?
>
>    It's my understanding that timeslice overrun prevents
>    'time echo 0 > /sys/devices/system/cpu/cpu1/online' from being an
>    accurate measure of how long it takes to offline a CPU.
>
>    A return of 0 (zero) signifies the successful completion of offlining the
>    CPU from the kernel's point of view.
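
For the return-value check in step 4, something along these lines might
work as a starting point. This is only a sketch: the function name
offline_cpu, the CPU_SYSFS_DIR variable, and the choice of exit codes are
made up for illustration, and it assumes the online attribute reads back
"0" once the CPU is down.

```shell
#!/bin/sh
# Sketch only: CPU_SYSFS_DIR and offline_cpu are illustrative names,
# not part of any existing test suite.
CPU_SYSFS_DIR="${CPU_SYSFS_DIR:-/sys/devices/system/cpu}"

# offline_cpu N: try to take cpuN offline and report what happened.
# Returns 0 on success, 2 if cpuN has no online attribute,
# 1 on any other failure.
offline_cpu() {
    attr="$CPU_SYSFS_DIR/cpu$1/online"

    # Some architectures do not expose an online attribute for cpu0.
    if [ ! -e "$attr" ]; then
        echo "cpu$1: no online attribute (not hot-pluggable here)" >&2
        return 2
    fi

    # A zero exit status from the write is the kernel's "success".
    if ! echo 0 > "$attr"; then
        echo "cpu$1: write to $attr failed" >&2
        return 1
    fi

    # Read the attribute back rather than trusting the exit status alone.
    state=$(cat "$attr")
    if [ "$state" != "0" ]; then
        echo "cpu$1: still reports online=$state" >&2
        return 1
    fi

    echo "cpu$1 offline"
    return 0
}

# Example (needs root on a real system):
#   offline_cpu 1
```

Keeping the sysfs directory in a variable lets the script be dry-run
against a scratch directory before it is pointed at a real machine.
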
>
>    Verify the smp_affinity mask of the affected disk controller.
>
> 5. Analyze data collected from /proc/interrupts?
>
>    Relevant messages in /var/log/messages regarding the procedure will occur
>    depending on the architecture tested on.

-- 
Mary Edie Meredith
maryedie@xxxxxxxx
503-906-1942
Open Source Development Labs
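
P.S. Here is a rough sketch of the write-a-pattern-then-compare idea from
step 3. The function names, the pattern text, and the file names are all
invented for illustration. Note that a read-back right after the write may
be served from the page cache, so by itself this does not prove the data
reached the disk.

```shell
#!/bin/sh
# Sketch only: gen_pattern, write_pattern and verify_pattern are
# illustrative names, and the pattern itself is arbitrary.

# gen_pattern COUNT: emit COUNT numbered lines of a reproducible pattern.
gen_pattern() {
    i=0
    while [ "$i" -lt "$1" ]; do
        echo "hotplug-test line $i 0123456789abcdef"
        i=$((i + 1))
    done
}

# write_pattern FILE COUNT: write the pattern to FILE, then ask the
# kernel to flush dirty buffers.
write_pattern() {
    gen_pattern "$2" > "$1" && sync
}

# verify_pattern FILE COUNT: regenerate the pattern and compare it
# byte for byte with FILE. Returns 0 only if they are identical.
verify_pattern() {
    gen_pattern "$2" | cmp -s - "$1"
}

# Example: several writers in parallel, then check each file.
#   for n in 1 2 3 4; do write_pattern "dud.$n" 100000 & done; wait
#   for n in 1 2 3 4; do verify_pattern "dud.$n" 100000 || echo "dud.$n BAD"; done
```

Running the writers in the background while a CPU is taken offline, and
verifying afterwards, would give a pass/fail answer for "all writes
completed and correct".
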