On 04/24/2014 03:14 AM, Mel Gorman wrote: > On Mon, Apr 21, 2014 at 11:24:25AM -0700, Dave Hansen wrote: >> @@ -105,9 +108,10 @@ static void flush_tlb_func(void *info) >> >> count_vm_tlb_event(NR_TLB_REMOTE_FLUSH_RECEIVED); >> if (this_cpu_read(cpu_tlbstate.state) == TLBSTATE_OK) { >> - if (f->flush_end == TLB_FLUSH_ALL) >> + if (f->flush_end == TLB_FLUSH_ALL) { >> local_flush_tlb(); >> - else if (!f->flush_end) >> + trace_tlb_flush(TLB_REMOTE_SHOOTDOWN, TLB_FLUSH_ALL); >> + } else if (!f->flush_end) >> __flush_tlb_single(f->flush_start); >> else { >> unsigned long addr; > > Why is only the TLB_FLUSH_ALL case traced here and not the single flush > or range of flushes? __native_flush_tlb_single() doesn't have a trace > point so I worry we are missing visibility on this part in particular > this part. > > while (addr < f->flush_end) { > __flush_tlb_single(addr); > addr += PAGE_SIZE; > } You're right, I missed that bit. I've corrected in a later version of the patch. >> @@ -152,7 +156,9 @@ void flush_tlb_current_task(void) >> preempt_disable(); >> >> count_vm_tlb_event(NR_TLB_LOCAL_FLUSH_ALL); >> + trace_tlb_flush(TLB_LOCAL_SHOOTDOWN, TLB_FLUSH_ALL); >> local_flush_tlb(); >> + trace_tlb_flush(TLB_LOCAL_SHOOTDOWN_DONE, TLB_FLUSH_ALL); >> if (cpumask_any_but(mm_cpumask(mm), smp_processor_id()) < nr_cpu_ids) >> flush_tlb_others(mm_cpumask(mm), mm, 0UL, TLB_FLUSH_ALL); >> preempt_enable(); > > Are the two tracepoints really useful? Are they fine enough to measure > the cost of the TLB flush? It misses the refill obviously but not much > we can do there. It's fine enough, but I did realize over time that the cost of the tracepoint is about 3x the cost of a 1-page tlb flush itself, so these are unusable for detailed measurements. I'll remove it for now. >> #endif /* _LINUX_MM_TYPES_H */ >> diff -puN /dev/null include/trace/events/tlb.h >> --- /dev/null 2014-04-10 11:28:14.066815724 -0700 >> +++ b/include/trace/events/tlb.h 2014-04-21 11:10:35.529868198 -0700 >> @@ -0,0 +1,37 @@ >> +#undef TRACE_SYSTEM >> +#define TRACE_SYSTEM tlb >> + >> +#if !defined(_TRACE_TLB_H) || defined(TRACE_HEADER_MULTI_READ) >> +#define _TRACE_TLB_H >> + >> +#include <linux/mm_types.h> >> +#include <linux/tracepoint.h> >> + >> +extern const char * const tlb_flush_reason_desc[]; >> + >> +TRACE_EVENT(tlb_flush, >> + >> + TP_PROTO(int reason, unsigned long pages), >> + TP_ARGS(reason, pages), >> + >> + TP_STRUCT__entry( >> + __field( int, reason) >> + __field(unsigned long, pages) >> + ), >> + >> + TP_fast_assign( >> + __entry->reason = reason; >> + __entry->pages = pages; >> + ), >> + >> + TP_printk("pages: %ld reason: %d (%s)", >> + __entry->pages, >> + __entry->reason, >> + tlb_flush_reason_desc[__entry->reason]) >> +); >> + > > I would also suggest you match the output formatting with writeback.h > which would look like > > pages:%lu reason:%s > > The raw format should still have the integer while the string formatting > would have something human readable. I can do that. The only bummer with the human-readable strings is turning them back in to something that the filters can take. I think I'll just do: + TP_printk("pages:%ld reason:%s (%d)", + __entry->pages, + __print_symbolic(__entry->reason, TLB_FLUSH_REASON), + __entry->reason) +); -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>