On Mon, 2010-06-21 at 15:00 +0300, Avi Kivity wrote:
> On 06/21/2010 12:31 PM, Zhang, Yanmin wrote:
> > The 2nd patch is to change the definition of perf_event to facilitate
> > perf attr copy when a hypercall happens.
> >
> > Signed-off-by: Zhang Yanmin<yanmin_zhang@xxxxxxxxxxxxxxx>
> >
> > ---
> >
> > --- linux-2.6_tip0620/include/linux/perf_event.h	2010-06-21 15:19:52.821999849 +0800
> > +++ linux-2.6_tip0620perfkvm/include/linux/perf_event.h	2010-06-21 16:53:49.283999849 +0800
> > @@ -188,7 +188,10 @@ struct perf_event_attr {
> >   	__u64		sample_type;
> >   	__u64		read_format;
> >
>
> Assuming these flags are available to the guest?
These flags are used by generic perf code. To match the host kernel, we
want the guest OS to use the same flags.

> > -	__u64		disabled	: 1, /* off by default */
> > +	union {
> > +		__u64	flags;
> > +		struct {
> > +			__u64	disabled	: 1, /* off by default */
> >   				inherit		: 1, /* children inherit it */
> >
>
> inherit is meaningless for a guest.
Right. The host kernel resets it to 0 before creating a perf_event for
the guest OS. We can't delete the flag here, as it's used by perf
generic code.

I need to separate the patch a little better. The definitions in
include/linux/perf_event.h are mostly related to perf generic code, so
I'm very careful about changing them.

> >   				pinned		: 1, /* must always be on PMU */
> >
>
> We cannot allow a guest to pin a counter.
Ok. I will reset it in function kvm_pv_perf_op_open.

> The other flags are also problematic.  I'd like to see virt-specific
> flags (probably we'll only need kernel/user and nested_hv for nested
> virtualization).
How about adding more comments around struct guest_perf_attr->flags
saying that guest OS developers should take a look at
include/linux/perf_event.h? BTW, I will reset more flags to 0 in
kvm_pv_perf_op_open.

> Something that is worrying is that we don't expose group information.
> perf will multiplex the events for us, but there will be a loss in
> accuracy.
> >
> >   #ifdef CONFIG_HAVE_HW_BREAKPOINT
> >   #include<asm/hw_breakpoint.h>
> >   #endif
> > @@ -753,6 +752,20 @@ struct perf_event {
> >
> >   	perf_overflow_handler_t overflow_handler;
> >
> > +	/*
> > +	 * pointers used by kvm perf paravirt interface.
> > +	 *
> > +	 * 1) Used in host kernel and points to host_perf_shadow which
> > +	 * has information about guest perf_event
> > +	 */
> > +	void *host_perf_shadow;
> >
>
> Can we have real types instead of void pointers?
I just want perf generic code to have less dependency on KVM code.

> > +	/*
> > +	 * 2) Used in guest kernel and points to guest_perf_shadow which
> > +	 * is used as a communication area with host kernel. Host kernel
> > +	 * copies overflow data to it when an event overflows.
> > +	 */
> > +	void *guest_perf_shadow;
> >
>
> It's strange to see both guest and host parts in the same patch.
> Splitting to separate patches will really help review.
It's a little hard to split the patches when they change the same file.
Perhaps I could add more explanation before the patch when I send it
out.

> > @@ -1626,9 +1629,22 @@ void perf_event_task_tick(struct task_st
> >   	if (ctx&&  ctx->nr_events&&  ctx->nr_events != ctx->nr_active)
> >   		rotate = 1;
> >
> > -	perf_ctx_adjust_freq(&cpuctx->ctx);
> > -	if (ctx)
> > -		perf_ctx_adjust_freq(ctx);
> > +#ifdef CONFIG_KVM_PERF
> > +	if (kvm_para_available()) {
> > +		/*
> > +		 * perf_ctx_adjust_freq causes lots of pmu->read which would
> > +		 * trigger too many vmexit to host kernel. We disable it
> > +		 * under para virt situation
> > +		 */
> > +		adjust_freq = 0;
> > +	}
> > +#endif
> >
>
> Perhaps we can have a batch read interface which will read many counters
> at once.
It's a good idea. But that would touch a lot of perf generic code, which
would make it hard to maintain or to follow future changes.

> This would reduce the number of exits.  Also adjust the
> frequency less frequently.
Here it depends on the process scheduler frequency, CONFIG_HZ.
> > +
> > +	if (adjust_freq) {
> > +		perf_ctx_adjust_freq(&cpuctx->ctx);
> > +		if (ctx)
> > +			perf_ctx_adjust_freq(ctx);
> > +	}
> >

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html