On Wed, Jun 15, 2022 at 10:00 AM Kalesh Singh <kaleshsingh@xxxxxxxxxx> wrote: > > On Wed, Jun 1, 2022 at 8:31 PM Kalesh Singh <kaleshsingh@xxxxxxxxxx> wrote: > > > > On Wed, Jun 1, 2022 at 8:02 AM Christian König > > <ckoenig.leichtzumerken@xxxxxxxxx> wrote: > > > > > > Am 01.06.22 um 00:48 schrieb Stephen Brennan: > > > > Kalesh Singh <kaleshsingh@xxxxxxxxxx> writes: > > > >> On Tue, May 31, 2022 at 3:07 PM Stephen Brennan > > > >> <stephen.s.brennan@xxxxxxxxxx> wrote: > > > >>> On 5/31/22 14:25, Kalesh Singh wrote: > > > >>>> In order to identify the type of memory a process has pinned through > > > >>>> its open fds, add the file path to fdinfo output. This allows > > > >>>> identifying memory types based on common prefixes. e.g. "/memfd...", > > > >>>> "/dmabuf...", "/dev/ashmem...". > > > >>>> > > > >>>> Access to /proc/<pid>/fdinfo is governed by PTRACE_MODE_READ_FSCREDS > > > >>>> the same as /proc/<pid>/maps which also exposes the file path of > > > >>>> mappings; so the security permissions for accessing path is consistent > > > >>>> with that of /proc/<pid>/maps. > > > >>> Hi Kalesh, > > > >> Hi Stephen, > > > >> > > > >> Thanks for taking a look. > > > >> > > > >>> I think I see the value in the size field, but I'm curious about path, > > > >>> which is available via readlink /proc/<pid>/fd/<n>, since those are > > > >>> symlinks to the file themselves. > > > >> This could work if we are root, but the file permissions wouldn't > > > >> allow us to do the readlink on other processes otherwise. We want to > > > >> be able to capture the system state in production environments from > > > >> some trusted process with ptrace read capability. > > > > Interesting, thanks for explaining. It seems weird to have a duplicate > > > > interface for the same information but such is life. > > > > > > Yeah, the size change is really straight forward but for this one I'm > > > not 100% sure either. > > > > The 2 concerns I think are: > > 1. Fun characters in the path names > > 2. If exposing the path is appropriate to begin with. > > > > One way I think we can address both is to only expose the path for > > anon inodes. Then we have well-known path formats and we don't expose > > much about which files a process is accessing since these aren't real > > paths. > > > > + if (is_anon_inode(inode)) { > > + seq_puts(m, "path:\t"); > > + seq_file_path(m, file, "\n"); > > + seq_putc(m, '\n'); > > + } > > > > Interested to hear thoughts on it. > > Adding Christoph, > > To be able to identify types of shared memory processes pin through > FDs in production builds, we would like to add a 'path' field to > fdinfo of anon inodes. We could then use the common prefixes > ("/dmabuf", "/memfd", ...) to identify different types. > > Would appreciate any feedback from the FS perspective. Hi all, If there are no objections to this, then I plan to respin the patch for just anonymous inodes. Please let me know if there are further concerns. Thanks, Kalesh > > Thanks, > Kalesh > > > > > > > > > Probably best to ping some core fs developer before going further with it. > > > > linux-fsdevel is cc'd here. Adding Al Vrio as well. Please let me know > > if there are other parties I should include. > > > > > > > > BTW: Any preferred branch to push this upstream? If not I can take it > > > through drm-misc-next. > > > > No other dependencies for this, so drm-misc-next is good. > > > > Thanks, > > Kalesh > > > > > > > > Regards, > > > Christian. > > > > > > > > > > >>> File paths can contain fun characters like newlines or colons, which > > > >>> could make parsing out filenames in this text file... fun. How would your > > > >>> userspace parsing logic handle "/home/stephen/filename\nsize:\t4096"? The > > > >>> readlink(2) API makes that easy already. > > > >> I think since we have escaped the "\n" (seq_file_path(m, file, "\n")), > > > > I really should have read through that function before commenting, > > > > thanks for teaching me something new :) > > > > > > > > Stephen > > > > > > > >> then user space might parse this line like: > > > >> > > > >> if (strncmp(line, "path:\t", 6) == 0) > > > >> char* path = line + 6; > > > >> > > > >> > > > >> Thanks, > > > >> Kalesh > > > >> > > > >>> Is the goal avoiding races (e.g. file descriptor 3 is closed and reopened > > > >>> to a different path between reading fdinfo and stating the fd)? > > > >>> > > > >>> Stephen > > > >>> > > > >>>> Signed-off-by: Kalesh Singh <kaleshsingh@xxxxxxxxxx> > > > >>>> --- > > > >>>> > > > >>>> Changes from rfc: > > > >>>> - Split adding 'size' and 'path' into a separate patches, per Christian > > > >>>> - Fix indentation (use tabs) in documentaion, per Randy > > > >>>> > > > >>>> Documentation/filesystems/proc.rst | 14 ++++++++++++-- > > > >>>> fs/proc/fd.c | 4 ++++ > > > >>>> 2 files changed, 16 insertions(+), 2 deletions(-) > > > >>>> > > > >>>> diff --git a/Documentation/filesystems/proc.rst b/Documentation/filesystems/proc.rst > > > >>>> index 779c05528e87..591f12d30d97 100644 > > > >>>> --- a/Documentation/filesystems/proc.rst > > > >>>> +++ b/Documentation/filesystems/proc.rst > > > >>>> @@ -1886,14 +1886,16 @@ if precise results are needed. > > > >>>> 3.8 /proc/<pid>/fdinfo/<fd> - Information about opened file > > > >>>> --------------------------------------------------------------- > > > >>>> This file provides information associated with an opened file. The regular > > > >>>> -files have at least five fields -- 'pos', 'flags', 'mnt_id', 'ino', and 'size'. > > > >>>> +files have at least six fields -- 'pos', 'flags', 'mnt_id', 'ino', 'size', > > > >>>> +and 'path'. > > > >>>> > > > >>>> The 'pos' represents the current offset of the opened file in decimal > > > >>>> form [see lseek(2) for details], 'flags' denotes the octal O_xxx mask the > > > >>>> file has been created with [see open(2) for details] and 'mnt_id' represents > > > >>>> mount ID of the file system containing the opened file [see 3.5 > > > >>>> /proc/<pid>/mountinfo for details]. 'ino' represents the inode number of > > > >>>> -the file, and 'size' represents the size of the file in bytes. > > > >>>> +the file, 'size' represents the size of the file in bytes, and 'path' > > > >>>> +represents the file path. > > > >>>> > > > >>>> A typical output is:: > > > >>>> > > > >>>> @@ -1902,6 +1904,7 @@ A typical output is:: > > > >>>> mnt_id: 19 > > > >>>> ino: 63107 > > > >>>> size: 0 > > > >>>> + path: /dev/null > > > >>>> > > > >>>> All locks associated with a file descriptor are shown in its fdinfo too:: > > > >>>> > > > >>>> @@ -1920,6 +1923,7 @@ Eventfd files > > > >>>> mnt_id: 9 > > > >>>> ino: 63107 > > > >>>> size: 0 > > > >>>> + path: anon_inode:[eventfd] > > > >>>> eventfd-count: 5a > > > >>>> > > > >>>> where 'eventfd-count' is hex value of a counter. > > > >>>> @@ -1934,6 +1938,7 @@ Signalfd files > > > >>>> mnt_id: 9 > > > >>>> ino: 63107 > > > >>>> size: 0 > > > >>>> + path: anon_inode:[signalfd] > > > >>>> sigmask: 0000000000000200 > > > >>>> > > > >>>> where 'sigmask' is hex value of the signal mask associated > > > >>>> @@ -1949,6 +1954,7 @@ Epoll files > > > >>>> mnt_id: 9 > > > >>>> ino: 63107 > > > >>>> size: 0 > > > >>>> + path: anon_inode:[eventpoll] > > > >>>> tfd: 5 events: 1d data: ffffffffffffffff pos:0 ino:61af sdev:7 > > > >>>> > > > >>>> where 'tfd' is a target file descriptor number in decimal form, > > > >>>> @@ -1968,6 +1974,7 @@ For inotify files the format is the following:: > > > >>>> mnt_id: 9 > > > >>>> ino: 63107 > > > >>>> size: 0 > > > >>>> + path: anon_inode:inotify > > > >>>> inotify wd:3 ino:9e7e sdev:800013 mask:800afce ignored_mask:0 fhandle-bytes:8 fhandle-type:1 f_handle:7e9e0000640d1b6d > > > >>>> > > > >>>> where 'wd' is a watch descriptor in decimal form, i.e. a target file > > > >>>> @@ -1992,6 +1999,7 @@ For fanotify files the format is:: > > > >>>> mnt_id: 9 > > > >>>> ino: 63107 > > > >>>> size: 0 > > > >>>> + path: anon_inode:[fanotify] > > > >>>> fanotify flags:10 event-flags:0 > > > >>>> fanotify mnt_id:12 mflags:40 mask:38 ignored_mask:40000003 > > > >>>> fanotify ino:4f969 sdev:800013 mflags:0 mask:3b ignored_mask:40000000 fhandle-bytes:8 fhandle-type:1 f_handle:69f90400c275b5b4 > > > >>>> @@ -2018,6 +2026,7 @@ Timerfd files > > > >>>> mnt_id: 9 > > > >>>> ino: 63107 > > > >>>> size: 0 > > > >>>> + path: anon_inode:[timerfd] > > > >>>> clockid: 0 > > > >>>> ticks: 0 > > > >>>> settime flags: 01 > > > >>>> @@ -2042,6 +2051,7 @@ DMA Buffer files > > > >>>> mnt_id: 9 > > > >>>> ino: 63107 > > > >>>> size: 32768 > > > >>>> + path: /dmabuf: > > > >>>> count: 2 > > > >>>> exp_name: system-heap > > > >>>> > > > >>>> diff --git a/fs/proc/fd.c b/fs/proc/fd.c > > > >>>> index 464bc3f55759..8889a8ba09d4 100644 > > > >>>> --- a/fs/proc/fd.c > > > >>>> +++ b/fs/proc/fd.c > > > >>>> @@ -60,6 +60,10 @@ static int seq_show(struct seq_file *m, void *v) > > > >>>> seq_printf(m, "ino:\t%lu\n", file_inode(file)->i_ino); > > > >>>> seq_printf(m, "size:\t%lli\n", (long long)file_inode(file)->i_size); > > > >>>> > > > >>>> + seq_puts(m, "path:\t"); > > > >>>> + seq_file_path(m, file, "\n"); > > > >>>> + seq_putc(m, '\n'); > > > >>>> + > > > >>>> /* show_fd_locks() never deferences files so a stale value is safe */ > > > >>>> show_fd_locks(m, file, files); > > > >>>> if (seq_has_overflowed(m)) > > > >>> -- > > > >>> To unsubscribe from this group and stop receiving emails from it, send an email to kernel-team+unsubscribe@xxxxxxxxxxx. > > > >>> > > > > _______________________________________________ > > > > Linaro-mm-sig mailing list -- linaro-mm-sig@xxxxxxxxxxxxxxxx > > > > To unsubscribe send an email to linaro-mm-sig-leave@xxxxxxxxxxxxxxxx > > >