On Wed, Jan 27, 2021 at 5:47 AM Jann Horn <jannh@xxxxxxxxxx> wrote: > > +jeffv from Android > > On Tue, Jan 26, 2021 at 11:51 PM Kalesh Singh <kaleshsingh@xxxxxxxxxx> wrote: > > In order to measure how much memory a process actually consumes, it is > > necessary to include the DMA buffer sizes for that process in the memory > > accounting. Since the handle to DMA buffers are raw FDs, it is important > > to be able to identify which processes have FD references to a DMA buffer. > > Or you could try to let the DMA buffer take a reference on the > mm_struct and account its size into the mm_struct? That would probably > be nicer to work with than having to poke around in procfs separately > for DMA buffers. > > > Currently, DMA buffer FDs can be accounted using /proc/<pid>/fd/* and > > /proc/<pid>/fdinfo -- both of which are only root readable, as follows: > > That's not quite right. They can both also be accessed by the user > owning the process. Also, fdinfo is a standard interface for > inspecting process state that doesn't permit reading process memory or > manipulating process state - so I think it would be fine to permit > access to fdinfo under a PTRACE_MODE_READ_FSCRED check, just like the > interface you're suggesting. Hi everyone. Thank you for the feedback. I understand there is a deeper problem of accounting shared memory in the kernel, that’s not only specific to the DMA buffers. In this case DMA buffers, I think Jann’s proposal is the cleanest way to attribute the shared buffers to processes. I can respin a patch modifying fdinfo as suggested, if this is not an issue from a security perspective. Thanks, Kalesh > > > > 1. Do a readlink on each FD. > > 2. If the target path begins with "/dmabuf", then the FD is a dmabuf FD. > > 3. stat the file to get the dmabuf inode number. > > 4. Read/ proc/<pid>/fdinfo/<fd>, to get the DMA buffer size. > > > > Android captures per-process system memory state when certain low memory > > events (e.g a foreground app kill) occur, to identify potential memory > > hoggers. To include a process’s dmabuf usage as part of its memory state, > > the data collection needs to be fast enough to reflect the memory state at > > the time of such events. > > > > Since reading /proc/<pid>/fd/ and /proc/<pid>/fdinfo/ requires root > > privileges, this approach is not suitable for production builds. > > It should be easy to add enough information to /proc/<pid>/fdinfo/ so > that you don't need to look at /proc/<pid>/fd/ anymore. > > > Granting > > root privileges even to a system process increases the attack surface and > > is highly undesirable. Additionally this is slow as it requires many > > context switches for searching and getting the dma-buf info. > > What do you mean by "context switches"? Task switches or kernel/user > transitions (e.g. via syscall)? > > > With the addition of per-buffer dmabuf stats in sysfs [1], the DMA buffer > > details can be queried using their unique inode numbers. > > > > This patch proposes adding a /proc/<pid>/task/<tid>/dmabuf_fds interface. > > > > /proc/<pid>/task/<tid>/dmabuf_fds contains a list of inode numbers for > > every DMA buffer FD that the task has. Entries with the same inode > > number can appear more than once, indicating the total FD references > > for the associated DMA buffer. > > > > If a thread shares the same files as the group leader then its > > dmabuf_fds file will be empty, as these dmabufs are reported by the > > group leader. > > > > The interface requires PTRACE_MODE_READ_FSCRED (same as /proc/<pid>/maps) > > and allows the efficient accounting of per-process DMA buffer usage without > > requiring root privileges. (See data below) > > I'm not convinced that introducing a new procfs file for this is the > right way to go. And the idea of having to poke into multiple > different files in procfs and in sysfs just to be able to compute a > proper memory usage score for a process seems weird to me. "How much > memory is this process using" seems like the kind of question the > kernel ought to be able to answer (and the kernel needs to be able to > answer somewhat accurately so that its own OOM killer can do its job > properly)?