Re: [PATCHv3 bpf-next 0/9] mm/bpf/perf: Store build id in file object

On Sat, Mar 18, 2023 at 03:16:45PM +0000, Matthew Wilcox wrote:
> On Sat, Mar 18, 2023 at 09:33:49AM +0100, Jiri Olsa wrote:
> > On Thu, Mar 16, 2023 at 05:34:41PM +0000, Matthew Wilcox wrote:
> > > On Thu, Mar 16, 2023 at 06:01:40PM +0100, Jiri Olsa wrote:
> > > > hi,
> > > > this patchset adds build id object pointer to struct file object.
> > > > 
> > > > We have several use cases for build id to be used in BPF programs
> > > > [2][3].
> > > 
> > > Yes, you have use cases, but you never answered the question I asked:
> > > 
> > > Is this going to be enabled by every distro kernel, or is it for special
> > > use-cases where only people doing a very specialised thing who are
> > > willing to build their own kernels will use it?
> > 
> > I hope so, but I guess only time will tell.. given the responses from Ian
> > and Andrii there are 3 big users already
> 
> So the whole "There's a config option to turn it off" shtick is just a
> fig-leaf.  I won't ever see it turned off.  You're imposing the cost of
> this on EVERYONE who runs a distro kernel.  And almost nobody will see
> any benefits from it.  Thanks for admitting that.

I agree that build-ids are not useful for all 'struct file' uses, just
for executable files and for people who want better observability.

Having said that, it seems there will be no extra memory overhead at
least for a fedora:36 x86_64 kernel:

void __init files_init(void)
{
        filp_cachep = kmem_cache_create("filp", sizeof(struct file), 0,
                        SLAB_HWCACHE_ALIGN | SLAB_PANIC | SLAB_ACCOUNT, NULL);
        percpu_counter_init(&nr_files, 0, GFP_KERNEL);
}

[root@quaco ~]# pahole file | grep size: -A2
	/* size: 232, cachelines: 4, members: 20 */
	/* sum members: 228, holes: 1, sum holes: 4 */
	/* last cacheline: 40 bytes */
[acme@quaco perf-tools]$ uname -a
Linux quaco 6.1.11-100.fc36.x86_64 #1 SMP PREEMPT_DYNAMIC Thu Feb  9 20:36:30 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
[root@quaco ~]# head -2 /proc/slabinfo 
slabinfo - version: 2.1
# name            <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
[root@quaco ~]# grep -w filp /proc/slabinfo 
filp               12452  13056    256   32    2 : tunables    0    0    0 : slabdata    408    408      0
[root@quaco ~]#

so there are 24 bytes on the 4th cacheline that are not being used,
right?
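
For reference, the back-of-the-envelope arithmetic (assuming the usual
64-byte cachelines on x86_64):

/*
 * SLAB_HWCACHE_ALIGN rounds the slab object size up to a cacheline
 * boundary:
 *
 *	objsize = ALIGN(sizeof(struct file), 64)
 *	        = ALIGN(232, 64) = 256	(matches /proc/slabinfo above)
 *	spare   = 256 - 232 = 24 bytes
 *
 * so a 20-byte SHA-1 build id would fit in the existing padding with
 * 4 bytes to spare.
 */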

One other observation is that maybe we could do it like the 'struct sock'
hierarchy in networking, where we would have a 'struct exec_file' that
would be:

	struct exec_file {
		struct file	file;		/* must stay first so the pointers alias */
		char		build_id[20];	/* SHA-1 build id */
	};

say, and then when we create the 'struct file' in __alloc_file() we
could check some bit in 'flags', like Al Viro suggested, and pick a
different slab cache than 'filp_cachep', one with the extra space for
the build_id (and whatever other exec related state we may end up
wanting, if ever).
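
A minimal sketch of the allocation side, where 'exec_filp_cachep' and
the FILE_EXEC flag bit are made up names just for illustration:

static struct kmem_cache *exec_filp_cachep;	/* sized for struct exec_file */

static struct file *__alloc_file(int flags, const struct cred *cred)
{
	struct file *f;

	if (flags & FILE_EXEC)	/* hypothetical bit, per Al Viro's suggestion */
		f = kmem_cache_zalloc(exec_filp_cachep, GFP_KERNEL);
	else
		f = kmem_cache_zalloc(filp_cachep, GFP_KERNEL);
	if (unlikely(!f))
		return ERR_PTR(-ENOMEM);
	/* rest of __alloc_file() as today */
	...
}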

No core fs code would need to know about that except when we free the
file, so that it gets freed from the right slab cache.
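
Roughly like this on the freeing side, again with made up names (a
hypothetical FMODE_EXEC_CACHE bit recording which cache the object came
from):

static void file_free_rcu(struct rcu_head *head)
{
	struct file *f = container_of(head, struct file, f_rcuhead);

	put_cred(f->f_cred);
	if (f->f_mode & FMODE_EXEC_CACHE)	/* hypothetical bit */
		kmem_cache_free(exec_filp_cachep,
				container_of(f, struct exec_file, file));
	else
		kmem_cache_free(filp_cachep, f);
}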

So with current distro configs there would be no extra memory overhead,
if I read that SLAB_HWCACHE_ALIGN rounding right, no?

- Arnaldo


