On 4/24/18 2:17 AM, Alexey Dobriyan wrote:
> On Mon, Apr 23, 2018 at 10:21:01PM -0400, jeffm@xxxxxxxx wrote:
>> Memory pressure isn't really an issue on this machine, so we
>> end up using well over 100GB for proc files.
>
> Text files at scale!
>
>> With these patches applied, running the same testcase, the proc_inode
>> cache only gets to about 600k objects, which is about 99.7% fewer. I
>> get that procfs isn't supposed to be scalable, but this is kind of
>> extreme. :)
>
> Easy stuff:
> * all ->get_link hooks are broken in RCU lookup (use GFP_KERNEL),

It's a pretty common pattern in the kernel, but it's just as easy to set
inode->i_link during instantiation and keep RCU lookup. There aren't so
many of these to make it a real burden on memory.

> * "%.*s" for dentry names is probably unnecessary,
>   they're always NUL terminated

Ack.

> * kasprintf() does printing twice, since we're kind of care about /proc
>   performance, allocate for the worst case.

Ack, integrated with ->get_link fix.

> * "int nlinks = nlink_tgid;"
>   Unsigned police.

Ack. nlink_t{,g}id are both u8, but it's easy to make it consistent.

> * (inode->i_mode & S_IFLNK)
>   this is sketchy, S_ISLNK exists.
>

Ack.

Notes of my own:

proc_task_count_links also had the logic backward. It would add an extra
link to the count for the symlink rather than the dir.

proc_pid_files_revalidate only needs to check if the tasks share files
since it won't be called if it's not a symlink.

Thanks for the review,

-Jeff

-- 
Jeff Mahoney
SUSE Labs