On 11/16/20 7:54 AM, Matthew Wilcox wrote:
> It gets even more complicated with CPUs with multiple levels of TLB
> which support different TLB entry sizes.  My CPU reports:
>
> TLB info
>  Instruction TLB: 2M/4M pages, fully associative, 8 entries
>  Instruction TLB: 4K pages, 8-way associative, 64 entries
>  Data TLB: 1GB pages, 4-way set associative, 4 entries
>  Data TLB: 4KB pages, 4-way associative, 64 entries
>  Shared L2 TLB: 4KB/2MB pages, 6-way associative, 1536 entries

It's even "worse" on recent AMD systems.  Those will coalesce multiple
adjacent PTEs into a single TLB entry.  I think Alphas did something
like this back in the day with an opt-in.

Anyway, the changelog should probably replace:

> This enables PERF_SAMPLE_{DATA,CODE}_PAGE_SIZE to report accurate TLB
> page sizes.

with something more like:

	This enables PERF_SAMPLE_{DATA,CODE}_PAGE_SIZE to report accurate
	page table mapping sizes.

That's really the best we can do from software without digging into
microarchitecture-specific events.
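
To make "page table mapping size" concrete, here is a rough sketch of what software can actually derive: the size implied by the level at which a page table walk ends in a leaf entry.  It assumes x86-64 4-level paging with 4 KiB base pages, and the names (pt_level, mapping_size) are made up for illustration, not taken from the patch.  It says nothing about what size of entry the TLB ends up holding, which is the whole point.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical names, for illustration only.
 * Assumes x86-64 4-level paging with 4 KiB base pages, where each
 * level of the page table resolves 9 bits of the virtual address.
 */
enum pt_level {
	PT_LEVEL_PTE = 0,	/* leaf at the PTE level: 4 KiB mapping */
	PT_LEVEL_PMD = 1,	/* leaf at the PMD level: 2 MiB mapping */
	PT_LEVEL_PUD = 2,	/* leaf at the PUD level: 1 GiB mapping */
};

/*
 * Return the size in bytes of the mapping described by a leaf entry
 * found at the given level.  This is the page table mapping size.
 * The hardware may cache the translation in a TLB entry of a
 * different size (e.g. by coalescing adjacent PTEs), and this
 * function cannot know about that.
 */
static uint64_t mapping_size(enum pt_level leaf_level)
{
	assert(leaf_level >= PT_LEVEL_PTE && leaf_level <= PT_LEVEL_PUD);

	/* 12 bits of page offset, plus 9 bits per level above the PTE. */
	return (uint64_t)1 << (12 + 9 * (unsigned int)leaf_level);
}
```

With those assumptions, mapping_size(PT_LEVEL_PMD) gives 2 MiB for a PMD-level leaf no matter which of the TLBs listed above caches the translation.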