On Wed, Jan 11, 2023 at 8:45 AM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> On Wed, Jan 11, 2023 at 01:54:54PM +0100, Peter Zijlstra wrote:
> > On Tue, Jan 10, 2023 at 12:06:00PM -0800, Namhyung Kim wrote:
> >
> > > Another example, but in this case it's real, is ADDR.  We cannot update
> > > the data->addr just because filtered_sample_type has PHYS_ADDR or
> > > DATA_PAGE_SIZE as it'd lose the original value.
> >
> > Hmm, how about something like so?
> >
> > /*
> >  * if (flags & s) flags |= d; // without branches
> >  */
> > static __always_inline unsigned long
> > __cond_set(unsigned long flags, unsigned long s, unsigned long d)
> > {
> >         return flags | (d * !!(flags & s));
> > }
> >
> > Then:
> >
> >         fst = sample_type;
> >         fst = __cond_set(fst, PERF_SAMPLE_CODE_PAGE_SIZE, PERF_SAMPLE_IP);
> >         fst = __cond_set(fst, PERF_SAMPLE_DATA_PAGE_SIZE |
> >                          PERF_SAMPLE_PHYS_ADDR, PERF_SAMPLE_ADDR);
> >         fst = __cond_set(fst, PERF_SAMPLE_STACK_USER, PERF_SAMPLE_REGS_USER);
> >         fst &= ~data->sample_flags;
> >
>
> Hmm, I think it's better to write this like:
>
> static __always_inline unsigned long
> __cond_set(unsigned long flags, unsigned long s, unsigned long d)
> {
>         return d * !!(flags & s);
> }
>
>         fst = sample_type;
>         fst |= __cond_set(sample_type, PERF_SAMPLE_CODE_PAGE_SIZE, PERF_SAMPLE_IP);
>         fst |= __cond_set(sample_type, PERF_SAMPLE_DATA_PAGE_SIZE |
>                           PERF_SAMPLE_PHYS_ADDR, PERF_SAMPLE_ADDR);
>         fst |= __cond_set(sample_type, PERF_SAMPLE_STACK_USER, PERF_SAMPLE_REGS_USER);
>         fst &= ~data->sample_flags;
>
> Which should be identical but has fewer data dependencies and thus gives
> an OoO CPU more leeway to parallelize things.

Looks good.  Let me try this.

Thanks,
Namhyung
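
The "should be identical" claim above can be checked in user space. What
follows is only a sketch under stated assumptions, not the kernel code: the
DEMO_* constants are local stand-ins for the PERF_SAMPLE_* bits (their values
are chosen for illustration), the helpers are renamed cond_set_chained() and
cond_set() because identifiers with a leading double underscore are reserved
outside the kernel, and sample_flags is passed as a plain argument instead of
being read from data->sample_flags. The two forms agree because none of the
destination bits (IP, ADDR, REGS_USER) is tested as a source bit by a later
step, so testing the running value or the original sample_type gives the same
answer.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative stand-ins for the PERF_SAMPLE_* bits (values are assumptions). */
#define DEMO_IP             (1UL << 0)
#define DEMO_ADDR           (1UL << 3)
#define DEMO_REGS_USER      (1UL << 12)
#define DEMO_STACK_USER     (1UL << 13)
#define DEMO_PHYS_ADDR      (1UL << 19)
#define DEMO_DATA_PAGE_SIZE (1UL << 22)
#define DEMO_CODE_PAGE_SIZE (1UL << 23)

static const unsigned long demo_bits[] = {
    DEMO_IP, DEMO_ADDR, DEMO_REGS_USER, DEMO_STACK_USER,
    DEMO_PHYS_ADDR, DEMO_DATA_PAGE_SIZE, DEMO_CODE_PAGE_SIZE,
};
#define DEMO_NBITS (sizeof(demo_bits) / sizeof(demo_bits[0]))

/* First form from the thread: returns flags with d OR-ed in if any bit of s is set. */
static inline unsigned long
cond_set_chained(unsigned long flags, unsigned long s, unsigned long d)
{
    return flags | (d * !!(flags & s));
}

/* Second form: returns only d (or 0); the caller ORs it into the result. */
static inline unsigned long
cond_set(unsigned long flags, unsigned long s, unsigned long d)
{
    return d * !!(flags & s);
}

/* Each step depends on the result of the previous one. */
static unsigned long fst_chained(unsigned long sample_type, unsigned long sample_flags)
{
    unsigned long fst = sample_type;

    fst = cond_set_chained(fst, DEMO_CODE_PAGE_SIZE, DEMO_IP);
    fst = cond_set_chained(fst, DEMO_DATA_PAGE_SIZE | DEMO_PHYS_ADDR, DEMO_ADDR);
    fst = cond_set_chained(fst, DEMO_STACK_USER, DEMO_REGS_USER);
    return fst & ~sample_flags;
}

/* Each cond_set() reads only sample_type, so the three tests are independent. */
static unsigned long fst_independent(unsigned long sample_type, unsigned long sample_flags)
{
    unsigned long fst = sample_type;

    fst |= cond_set(sample_type, DEMO_CODE_PAGE_SIZE, DEMO_IP);
    fst |= cond_set(sample_type, DEMO_DATA_PAGE_SIZE | DEMO_PHYS_ADDR, DEMO_ADDR);
    fst |= cond_set(sample_type, DEMO_STACK_USER, DEMO_REGS_USER);
    return fst & ~sample_flags;
}

/* Expand a compact index (one bit per demo_bits[] entry) into a flag mask. */
static unsigned long build_mask(unsigned int combo)
{
    unsigned long mask = 0;

    assert(combo < (1u << DEMO_NBITS));
    for (size_t i = 0; i < DEMO_NBITS; i++)
        if (combo & (1u << i))
            mask |= demo_bits[i];
    return mask;
}

/* Compare both forms over every combination of the seven demo bits. */
static int count_mismatches(void)
{
    int mismatches = 0;

    for (unsigned int t = 0; t < (1u << DEMO_NBITS); t++)
        for (unsigned int f = 0; f < (1u << DEMO_NBITS); f++)
            if (fst_chained(build_mask(t), build_mask(f)) !=
                fst_independent(build_mask(t), build_mask(f)))
                mismatches++;
    return mismatches;
}
```

Calling count_mismatches() from a small main() walks all combinations of the
seven bits for both sample_type and sample_flags. The check covers only the
dependency pattern used in this thread; if a later step ever tested a bit that
an earlier step sets, the two forms would no longer be interchangeable.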