On Tue, 24 Sep 2019 20:26:36 +0900, Sahibzada Irfanullah said:

> After having a reasonable amount of log data,

If you're trying to figure out how the kernel memory manager is working, you're probably better off using 'perf' or one of the other tracing tools already in the kernel to track the memory manager. For starters, those tools can give you things like stack tracebacks, so you know who is asking for a page, who is *releasing* a page, and so on. Of course, which of these tools to use depends on what data you need to answer the question - but simply knowing what physical address was involved in a page fault is almost certainly not going to be sufficient.

> I want to perform some type of analysis at run time, e.g., no. of unique
> addresses, total no. of addresses, frequency of occurrences of each address,
> etc.

So what "some type of analysis" are you trying to do? What question(s) are you trying to answer?

The number of unique physical addresses in your system is dictated by how much RAM you have installed. The same goes for the total number of addresses - and I'm not sure why you list both, since a difference between the two would mean that some addresses are non-unique. What would that even mean?

The number of pages actually available for paging and caching depends on other things as well: the architecture of the system, how much RAM (if any) is reserved for use by your video card, the size of the kernel, the size of loaded modules, space taken up by kmalloc() allocations, page tables, whether any processes have called mlock() on a large chunk of space, whether pages are locked by the kernel because there's I/O going on, and then there's things like mmap(), and so on. The kernel provides /proc/meminfo and /proc/slabinfo - you're going to want to understand all of that before you can make sense of anything.
Simply looking at the frequency of occurrences of each address is probably not going to tell you much of anything, because you need to know things like the total working and resident set sizes of the process, plus other context. For example: you do the analysis, and find that there are 8 gigabytes of pages that are constantly being re-used. That doesn't tell you whether there are two processes thrashing against each other because each is doing heavy repeated referencing of 6 gigabytes of data, or one process wildly referencing many pages because some programmer has a multi-dimensional array and is walking across it with the indices in the wrong order:

    i_max = 4095; j_max = 4095;
    for (i = 0; i < i_max; i++)
        for (j = 0; j < j_max; j++)
            sum += foo[i][j];

If somebody is doing foo[j][i] instead, things can get ugly. And if you're mixing in Fortran code, where the semantics of array references are reversed and you *want* to use 'foo[j][i]' for efficient memory access, it's a bullet loaded in the chamber, waiting for somebody to pull the trigger.

Not that I've ever seen *that* particular error happen with a programmer processing 2 terabytes of arrays on a machine that only had 1.5 terabytes of RAM. But I did tease the person involved about it, because they *really* should have known better. :)

So again: what question(s) are you trying to get answers to?
_______________________________________________
Kernelnewbies mailing list
Kernelnewbies@xxxxxxxxxxxxxxxxx
https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies