On Tue, Feb 18, 2025 at 3:10 PM Stephen Brennan <stephen.s.brennan@xxxxxxxxxx> wrote:
>
> Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> writes:
> > On Tue, Feb 11, 2025 at 3:59 PM Stephen Brennan
> [...]
> >> We can dust that off and include it for a new version of this series.
> >> I'd be curious of what you'd like to see for kernel modules? A
> >> three-level tree would be too complex, in my opinion.
> >
> > What is the use case for vars in kernel modules?
>
> The use case would be the same as for the core kernel. My primary
> motivation is to allow drgn to understand the types of global variables,
> and that extends to kernel modules too.
>
> >> module BTF size increased by 53.2%.
> >
> > This is the sum of all mods with vars divided by
> > the sum of all mods without?
>
> That was a poorly done comparison, so let me provide this one that I did
> using 6.13 and these patches. It was essentially a localmodconfig for a
> VM instance, so I could still do better by picking a popular
> distribution config. But I think this is far more representative.
>
> MODULE                       BASE      COMP      CHG      PCT
> drm.ko                     115833    123410     7577    6.54%
> iscsi_boot_sysfs.ko          2627      5380     2753  104.80%
> joydev.ko                    1816      2289      473   26.05%
> libcxgbi.ko                 24556     25266      710    2.89%
> drm_vram_helper.ko          22325     22751      426    1.91%
> nvme-tcp.ko                 25044     25973      929    3.71%
> vfat.ko                      3448      3953      505   14.65%
> btrfs.ko                   275139    343686    68547   24.91%
> libiscsi.ko                 21177     21977      800    3.78%
> xt_owner.ko                   449       803      354   78.84%
> nft_ct.ko                    4912      6157     1245   25.35%
> iscsi_ibft.ko                3967      4463      496   12.50%
> pcspkr.ko                     283       682      399  140.99%
> crc32-pclmul.ko               390       771      381   97.69%
> nf_conntrack.ko             23686     28191     4505   19.02%
> iscsi_tcp.ko                16827     17750      923    5.49%
> nft_fib.ko                    835      1117      282   33.77%
> nf_reject_ipv6.ko             699       981      282   40.34%
> rfkill.ko                    4233      6410     2177   51.43%
> dm-region-hash.ko            6214      6496      282    4.54%
> cxgb3i.ko                   35469     37078     1609    4.54%
> dm-mirror.ko                 7576      8191      615    8.12%
> pvpanic-pci.ko                174       574      400  229.89%
> crct10dif-pclmul.ko           146       525      379  259.59%
> nvme-fabrics.ko             17341     18124      783    4.52%
> kvm-amd.ko                  47302     51914     4612    9.75%
> crc8.ko                       221       405      184   83.26%
> ib_iser.ko                  27769     29116     1347    4.85%
> sg.ko                        4234      5656     1422   33.59%
> intel_rapl_common.ko         5678      8446     2768   48.75%
> bochs.ko                    35643     36997     1354    3.80%
> sha1-ssse3.ko                 790      1305      515   65.19%
> kvm-intel.ko                53802     59220     5418   10.07%
> nft_chain_nat.ko              279       714      435  155.91%
> vmlinux                   5484970   7330096  1845126   33.64%
> sha256-ssse3.ko               851      1378      527   61.93%
> nf_nat.ko                    6341      7240      899   14.18%
> configs.ko                     72       256      184  255.56%
> xt_comment.ko                 151       507      356  235.76%
> ccp.ko                      30433     34782     4349   14.29%
> cxgb3.ko                    44981     47504     2523    5.61%
> crypto_simd.ko               1331      1613      282   21.19%
> iptable_filter.ko             855      1456      601   70.29%
> qedi.ko                     70653     72786     2133    3.02%
> drm_kms_helper.ko           63238     65000     1762    2.79%
> cnic.ko                    117074    117790      716    0.61%
> failover.ko                   780      1216      436   55.90%
> nft_redir.ko                  874      1529      655   74.94%
> serio_raw.ko                  708      1234      526   74.29%
> nf_defrag_ipv6.ko            1520      2253      733   48.22%
> nf_defrag_ipv4.ko             306       770      464  151.63%
> nft_reject_ipv4.ko            517       939      422   81.62%
> nft_nat.ko                   1192      1732      540   45.30%
> nft_reject_inet.ko            554       976      422   76.17%
> fuse.ko                     32181     41859     9678   30.07%
> nft_compat.ko                3705      4404      699   18.87%
> zstd_compress.ko            42597     43622     1025    2.41%
> tls.ko                      15140     20683     5543   36.61%
> virtio_pci.ko                8456      9193      737    8.72%
> blake2b_generic.ko           1364      1699      335   24.56%
> cryptd.ko                    3697      4297      600   16.23%
> xor.ko                       1358      1879      521   38.37%
> intel_rapl_msr.ko            2851      3440      589   20.66%
> kvm.ko                     177060    256377    79317   44.80%
> cxgb4.ko                   215865    220844     4979    2.31%
> bnx2i.ko                    39524     41477     1953    4.94%
> dm-round-robin.ko            1795      2123      328   18.27%
> virtio_pci_legacy_dev.ko      909      1191      282   31.02%
> qla4xxx.ko                  79040     82694     3654    4.62%
> nfs.ko                     108350    169642    61292   56.57%
> libata.ko                   47301     66188    18887   39.93%
> ghash-clmulni-intel.ko        578       997      419   72.49%
> nf_reject_ipv4.ko             706       988      282   39.94%
> nft_reject.ko                 820      1196      376   45.85%
> sunrpc.ko                  127496    197841    70345   55.17%
> nft_fib_ipv4.ko               803      1257      454   56.54%
> scsi_transport_iscsi.ko     40419     57633    17214   42.59%
> lockd.ko                    36144     42137     5993   16.58%
> drm_shmem_helper.ko         32555     33043      488    1.50%
> nvme-core.ko                50275     58298     8023   15.96%
> iw_cm.ko                    13405     14796     1391   10.38%
> mdio.ko                       857      1041      184   21.47%
> bnx2.ko                     20354     21611     1257    6.18%
> net_failover.ko              1742      2187      445   25.55%
> ip_set.ko                   11812     13093     1281   10.84%
> libcxgb.ko                   8698      8980      282    3.24%
> dm-multipath.ko              8124      8898      774    9.53%
> grace.ko                      462       890      428   92.64%
> virtio_net.ko               12322     14896     2574   20.89%
> qed.ko                     228735    232231     3496    1.53%
> cdc-acm.ko                   2923      3679      756   25.86%
> i2c-piix4.ko                 1124      2341     1217  108.27%
> pvpanic-mmio.ko               177       625      448  253.11%
> virtio_scsi.ko               3154      3898      744   23.59%
> uio.ko                       2602      4295     1693   65.07%
> nft_fib_ipv6.ko               956      1410      454   47.49%
> cec.ko                      28370     29266      896    3.16%
> qemu_fw_cfg.ko               1601      3476     1875  117.11%
> ttm.ko                      23672     25727     2055    8.68%
> sd_mod.ko                    9976     13030     3054   30.61%
> xfs.ko                     574594    926637   352043   61.27%
> libiscsi_tcp.ko             17444     17911      467    2.68%
> ib_cm.ko                    32324     62373    30049   92.96%
> aesni-intel.ko               3370      4922     1552   46.05%
> drm_client_lib.ko           27449     27794      345    1.26%
> virtio_pci_modern_dev.ko     2537      2819      282   11.12%
> rdma_cm.ko                  32504     51823    19319   59.44%
> fat.ko                      11958     13297     1339   11.20%
> dm-log.ko                    6529      6986      457    7.00%
> pata_acpi.ko                 9231      9700      469    5.08%
> ata_piix.ko                 10998     12598     1600   14.55%
> ipt_REJECT.ko                 956      1311      355   37.13%
> drm_ttm_helper.ko           33160     33544      384    1.16%
> be2iscsi.ko                 55078     56993     1915    3.48%
> i2c-smbus.ko                  582       973      391   67.18%
> cuse.ko                      8435      9241      806    9.56%
> nft_fib_inet.ko               579       995      416   71.85%
> ib_core.ko                 103656    123701    20045   19.34%
> pulse8-cec.ko                9153      9890      737    8.05%
> pvpanic.ko                    494      1087      593  120.04%
> dm-mod.ko                   31377     35265     3888   12.39%
> raid6_pq.ko                  2774      4207     1433   51.66%
> nft_reject_ipv6.ko            517       939      422   81.62%
> cxgb4i.ko                   47490     49021     1531    3.22%
> ata_generic.ko               9008      9666      658    7.30%
> vboxvideo.ko                47622     48844     1222    2.57%
> ip_tables.ko                 3109      3564      455   14.63%
>
> ALL MODS                  9153268  11895301  2742033   29.96%
> vmlinux                   5484970   7330096  1845126   33.64%
> TOTAL                    14638238  19225397  4587159   31.34%
>
> So this shows a 1.8 MiB increase in vmlinux size, or 33.6%.
> And for these modules in aggregate, an increase of 2.7 MiB or 30.0%.
>
> > Any outliers there?
> > I would expect modules to have few global variables.
>
> In terms of outliers, there are groups that stand out to me:
>
> 1. Large percentage increases are usually always for modules that had
> very tiny BTF before. The module system inherently creates a few
> global variables for each module, so there's always a slight constant
> increase of the BTF size (184 bytes, as far as I can tell), and in those
> cases it can be a quite large percentage.
> Here's an example,
> "configs.ko" which comes from the CONFIG_IKCONFIG enablement:
>
> BEFORE:
> $ bpftool btf dump file ../build_pahole_novars/kernel/configs.ko -B ../build_pahole_novars/vmlinux
> [127877] CONST '(anon)' type_id=11124
> [127878] ARRAY '(anon)' type_id=127877 index_type_id=21 nr_elems=1
> [127879] CONST '(anon)' type_id=127878
>
> AFTER:
> $ bpftool btf dump file ../build_pahole_vars/kernel/configs.ko -B ../build_pahole_vars/vmlinux
> [162827] CONST '(anon)' type_id=11124
> [162828] ARRAY '(anon)' type_id=162827 index_type_id=21 nr_elems=1
> [162829] CONST '(anon)' type_id=162828
> [162830] VAR '____versions' type_id=162829, linkage=static
> [162831] DATASEC '__versions' size=64 vlen=1
>         type_id=162830 offset=0 size=64 (VAR '____versions')
> [162832] VAR 'orc_header' type_id=8667, linkage=static
> [162833] DATASEC '.orc_header' size=20 vlen=1
>         type_id=162832 offset=0 size=20 (VAR 'orc_header')
> [162834] VAR '__this_module' type_id=312, linkage=global
> [162835] DATASEC '.gnu.linkonce.this_module' size=1344 vlen=1
>         type_id=162834 offset=0 size=1344 (VAR '__this_module')
>
> What is, I think, interesting is that the types in that module were
> totally useless to begin with, because they were used by a variable
> which didn't even get emitted. So while this is a substantial
> percentage-wise increase, I think it's a net improvement for this and
> other modules.
>
> 2. The largest absolute increases come from large, complex modules like
> xfs, kvm, sunrpc, btrfs, etc. For example, xfs had 5696 VAR
> declarations. What is disappointing is how much of this is due to
> automatically-generated "variables" from macros (e.g. tracepoints):
> Here is a list of variable prefixes like that:
>
> print_fmt_*
> trace_event_fields_*
> trace_event_type_funcs_*
> event_*
> __SCK__tp_func_*
> __bpf_trace_tp_map_*
> __event_*
> event_class_*
> TRACE_SYSTEM_*
> __TRACE_SYSTEM_*
> __tracepoint_*
>
> These are, unfortunately, all valid declarations produced by macros and
> they correspond to valid symbols as well. If you look at the kallsyms
> for the modules (and core kernel), these variables are present there as
> well. It may indeed make sense to have kallsyms entries for them: I
> don't know.
>
> These are all, as far as I'm concerned, totally uninteresting types. If
> you want to access any of this data, you probably already know its type
> and wouldn't need a BTF declaration. Unfortunately, the flip side is
> that I don't think we have a good way to automatically detect these,
> outside of prefix matching, which quickly goes out of date as the kernel
> changes, and can have false positives as well. For kernel modules, many
> of these may appear in separate ELF sections, but for vmlinux, they
> don't. I'd be happy to eliminate types for these auto-generated kinds of
> variables, if we could somehow annotate them so that pahole knows to
> ignore them. For instance, maybe we could use
>
> __attribute__((btf_decl_tag("btf_omit")))
>
> as an instruction to pahole to omit declarations for these things?
>

Can't we just put all such tracepoint-related variables into a
separate ELF section and teach pahole to ignore global variables
from that section?

btf_decl_tag is a similar idea, but (currently) won't work for
GCC-built kernels. So I'd go with the ELF section.

> Thanks,
> Stephen
>
> > So before we decide on what to do with vars in mods let's figure out
> > the need.
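
To make the difference between the two filtering approaches concrete,
here is a minimal Python sketch. It is not pahole code and only an
illustration under assumptions: the section name "__tracepoint_vars",
the symbol names in SYMBOLS and both helper functions are invented for
this example, and the input is a hand-written list of (variable,
section) pairs rather than anything read from an ELF file. The prefix
list is the one quoted above.

```python
from typing import List, Tuple

Symbol = Tuple[str, str]  # (variable name, ELF section name)

# Prefixes of macro-generated variables, as listed in the quoted mail.
GENERATED_PREFIXES = (
    "print_fmt_", "trace_event_fields_", "trace_event_type_funcs_",
    "event_", "__SCK__tp_func_", "__bpf_trace_tp_map_", "__event_",
    "event_class_", "TRACE_SYSTEM_", "__TRACE_SYSTEM_", "__tracepoint_",
)

# Hypothetical section name, invented for this sketch.
SKIP_SECTIONS = frozenset({"__tracepoint_vars"})

# Hypothetical input; none of these names are taken from a real module.
SYMBOLS: List[Symbol] = [
    ("__tracepoint_xfs_read", "__tracepoint_vars"),
    ("print_fmt_xfs_read", "__tracepoint_vars"),
    ("event_queue_depth", ".data"),                 # hand-written variable
    ("__newgen_tp_xfs_read", "__tracepoint_vars"),  # future macro output
    ("xfs_stats", ".data"),
]


def keep_by_prefix(symbols: List[Symbol]) -> List[str]:
    """Keep variables whose name matches none of the known prefixes."""
    return [name for name, _ in symbols
            if not name.startswith(GENERATED_PREFIXES)]


def keep_by_section(symbols: List[Symbol]) -> List[str]:
    """Keep variables that do not live in a section marked to be skipped."""
    return [name for name, section in symbols
            if section not in SKIP_SECTIONS]


if __name__ == "__main__":
    print(keep_by_prefix(SYMBOLS))   # ['__newgen_tp_xfs_read', 'xfs_stats']
    print(keep_by_section(SYMBOLS))  # ['event_queue_depth', 'xfs_stats']
```

With this made-up input, prefix matching wrongly drops the hand-written
"event_queue_depth" and misses the new "__newgen_tp_*" variable, while
the section test gets both right without a list that needs maintaining.
It does rely on the macros placing their variables in that section for
vmlinux as well as for modules.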