Vincent, On Mon, 28 Feb 2011, Vincent Guittot wrote: > On 24 February 2011 19:40, Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote: > > On Thu, 24 Feb 2011, Vincent Guittot wrote: Sorry, this mail got eaten somehow, but I got it from my archive. > >> #ifdef CONFIG_SMP > >> /* Serializes the updates to cpu_online_mask, cpu_present_mask */ > >> static DEFINE_MUTEX(cpu_add_remove_lock); > >> @@ -197,10 +200,13 @@ struct take_cpu_down_param { > >> static int __ref take_cpu_down(void *_param) > >> { > >> struct take_cpu_down_param *param = _param; > >> + unsigned int cpu = (unsigned int)(param->hcpu); > >> int err; > >> > >> /* Ensure this CPU doesn't handle any more interrupts. */ > >> + trace_cpu_hotplug_disable_start(cpu); > >> err = __cpu_disable(); > >> + trace_cpu_hotplug_disable_end(cpu); > > > > How useful. What about recording the return code of __cpu_disable()? > > > > The goal is to monitor the cpu hotplug activity and duration. I want > to detect 2 kind of cpu_down/cpu_up call, ones which succeed to > unplug/plug a core and ones which don't. But I'm not sure that we need > to sort the failed calls into to the trace. We trace them because too > much fails could point out a bug or a wrong use of cpu hotplug. This does not make sense at all. You want to see the failures, then recording the error code makes even more sense. Your way of decoding the error case by checking whether the next trace entry is there or missing is just sloppy. > >> if (err < 0) > >> return err; > > > >> + trace_cpu_hotplug_down_start(cpu); > >> + > > > > What's the point of this tracepoint _BEFORE_ the cpu_hotplug_disabled > > check without recording cpu_hotplug_disabled ? > > > > I want to trace all cpu_down call even those which returns immediately > which will be part of the failed calls. Decoding failure from missing entries is simply wrong. > >> if (cpu_hotplug_disabled) { > >> err = -EBUSY; > >> goto out; > >> @@ -284,6 +294,8 @@ int __ref cpu_down(unsigned int cpu) > >> err = _cpu_down(cpu, 0); > >> > >> out: > >> + trace_cpu_hotplug_down_end(cpu); > > > > And this one is misplaced as well. It wants to be only called when we > > actually called _cpu_down() and it wants to record the return code as > > well. > > > > It has been placed here to be called each time > trace_cpu_hotplug_down_start is called. That does not make sense at all. You cannot distinguish between cpu_hotplug_disabled set and the _cpu_down() being called case. Or do you want to tell me that you decode that from the time stamps? Hell no. We want traces which are readable w/o crystal balls. Thanks, tglx