Re: [PATCH v3] perf: map pages in advance

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 23.12.24 12:10, Lorenzo Stoakes wrote:
Peter - could you drop this patch for now until I have a chance to take a
look at this issue on my return on 2nd Jan?

On Fri, Dec 20, 2024 at 10:53:14PM +0100, David Hildenbrand wrote:
On 20.12.24 22:29, David Hildenbrand wrote:
On 20.12.24 20:36, Chen, Zide wrote:


On 12/20/2024 1:56 AM, David Hildenbrand wrote:
On 20.12.24 10:31, Lorenzo Stoakes wrote:
On Thu, Dec 19, 2024 at 01:17:44PM -0800, Chen, Zide wrote:

With this patch, it seems perf tool has some problems in capturing the
kernel data with Intel PT.

Running the following commands, the size of perf.data is very small, and
perf script can't find any valid records.

perf record -e intel_pt//u -- /bin/ls
perf script --insn-trace



Hi,

I'm on leave (and should really go back to relaxing :>), returning on 2nd
Jan so can't really dig into this.

But I tried it on my intel box and it 'works on my machine' with and
without patch with commands provided, so I'm not sure this is actually a
product of this change (which shouldn't impact this).

Zide Chen, can you try with and without this patch to see if it
introduces the issue?

Yes, I re-did the test on a SPR server, and the result is same. Without
the patch, it went well; But with it, "perf script --insn-trace" doesn't
show valid records.

This time I tested it on the clean 6.13-rc1 tag, base commit
40384c840ea1944d7c5a392e8975ed088ecf0b37

Also, with this patch, running tools/perf/tests/shell/test_intel_pt.sh:

Error:
The - data has no samples!

I just tested it on 6.13-rc1  vs. 6.13-rc1  with this patch.

Indeed, there is quite difference. Below are the main parts that changed, only.

We seem to be recording data, but maybe what we record gets corrupted somehow?

Huge parts of the new file are full of 0s. Either we are mapping the wrong
pages, or reading from the pages (via PFNMAP) does not work as expected.


Thanks David, and apologies Zide, appears there is an issue here clearly.

Could you try this with sudo operations? I was doing this locally and I
wonder if there is now a permissioning error?

I ran it under root.


I'd be surprised if pfn map would cause an issue here as it should just
directly map the kernel memory, however if the PT code assumes there will
be faults there could be an issue. I did take a brief look at this last
week and it seems the PT stuff relies on the aux functionality, so that
could also be a source of problems here.

I started a bit at that code, no clue yet what's happening.

I was wondering if we end up mapping the wrong pages, meaning: the pages at mmap time end up being different to the pages later at fault time. The code is a bit confusing, but I thought we cannot change the effective event/pages while we have an active mmap. Maybe there is some corner case ...

Nothing else really jumped at me ... moving the mapping og pages after the event_mapped() callback also didn't change anything.


I am on leave at the moment returning on 2nd Jan, I will look at this as a
priority when I return, as you can see above I've asked Peter to drop this
for now.

Enjoy your time off an Happy Holidays!

--
Cheers,

David / dhildenb





[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux