Re: [PATCH] modpost: Optimize symbol search from linear to binary search

On Tue, Sep 19, 2023 at 6:06 AM Jack Brennen <jbrennen@xxxxxxxxxx> wrote:
>
> Modify modpost to use binary search for converting addresses back
> into symbol references.  Previously it used linear search.
>
> This change saves a few seconds of wall time for defconfig builds,
> but can save several minutes on allyesconfigs.

Thanks.
Binary search is a good idea.


> Before:
> $ make LLVM=1 -j128 allyesconfig vmlinux -s KCFLAGS="-Wno-error"
>         Elapsed (wall clock) time (h:mm:ss or m:ss): 13:30.31

Instead of the time for the entire build,
can you report the time for the modpost command alone?

For the allyesconfig case, that would be:

 $ time scripts/mod/modpost -M -m -a -N -o vmlinux.symvers vmlinux.o





> diff --git a/scripts/mod/symsearch.c b/scripts/mod/symsearch.c
> new file mode 100644
> index 000000000000..aab79262512b
> --- /dev/null
> +++ b/scripts/mod/symsearch.c
> @@ -0,0 +1,233 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +/* Helper functions for finding the symbol in an ELF which is "nearest"
> + * to a given address.
> + */
>

Can you use the following block comment style?

/*
 * Helper functions for finding the symbol in an ELF which is "nearest"
 * to a given address.
 */



> +#include "modpost.h"
> +
> +/* Struct used for binary search. */

I think this obvious comment is unneeded.



> +struct syminfo {
> +       unsigned int symbol_index;
> +       unsigned int section_index;
> +       Elf_Addr addr;
> +};
> +
> +/* Container used to hold an entire binary search table.
> + * Entries in table are ascending, sorted first by section_index,
> + * then by addr, and last by symbol_index.  The sorting by
> + * symbol_index is used to duplicate the quirks of the prior
> + * find_nearest_sym() function, where exact matches to an address
> + * return the first symtab entry seen, but near misses return the
> + * last symtab entry seen.

Preserving this quirk makes the code complicated.

I do not mind changing the behavior of the corner case.





> + * The first and last entries of the table are sentinels and their
> + * values only matter in two places:  when we sort the table, and
> + * on lookups, the end sentinel should not have an addr field which
> + * matches its immediate predecessor.  To meet these requirements,
> + * we initialize them to (0,0,0) and (max,max,max), and then after
> + * sorting, we tweak the end sentinel's addr field accordingly.
> + */
> +struct symsearch {
> +       size_t table_size;
> +       struct syminfo table[];
> +};



syminfo::symbol_index is unsigned int.
symsearch::table_size is size_t.


symbol_index of the last element is always larger than
elf->symsearch->table_size.

So the code only works within a 32-bit width anyway.












> +
> +static inline bool is_sym_searchable(struct elf_info *elf, Elf_Sym *sym)
> +{
> +       return is_valid_name(elf, sym) != 0;
> +}

If you call is_valid_name() directly, this function is unneeded, isn't it?






> +
> +static int syminfo_compare(const void *s1, const void *s2)
> +{
> +       const struct syminfo *sym1 = s1;
> +       const struct syminfo *sym2 = s2;
> +
> +       if (sym1->section_index > sym2->section_index)
> +               return 1;
> +       if (sym1->section_index < sym2->section_index)
> +               return -1;
> +       if (sym1->addr > sym2->addr)
> +               return 1;
> +       if (sym1->addr < sym2->addr)
> +               return -1;
> +       if (sym1->symbol_index > sym2->symbol_index)
> +               return 1;
> +       if (sym1->symbol_index < sym2->symbol_index)
> +               return -1;
> +       return 0;
> +}
> +
> +static size_t symbol_count(struct elf_info *elf)
> +{
> +       size_t result = 0;
> +
> +       for (Elf_Sym *sym = elf->symtab_start; sym < elf->symtab_stop; sym++) {
> +               if (is_sym_searchable(elf, sym))
> +                       result++;
> +       }
> +       return result;
> +}
> +
> +/* Populate the search array that we just allocated.
> + * Be slightly paranoid here.  If the ELF file changes during processing,

I do not understand this. In which case does the ELF file change?

modpost loads the entire file into memory first.

In which scenario would the memory content change?






> + * or if the behavior of is_sym_searchable() changes during processing,
> + * we want to catch it; neither of those is acceptable.
> + */
> +static void symsearch_populate(struct elf_info *elf,
> +                              struct syminfo *table,
> +                              size_t table_size)
> +{
> +       bool is_arm = (elf->hdr->e_machine == EM_ARM);
> +
> +       /* Start sentinel */
> +       if (table_size-- == 0)
> +               fatal("%s: size mismatch\n", __func__);
> +       table->symbol_index = 0;
> +       table->section_index = 0;
> +       table->addr = 0;
> +       table++;
> +
> +       for (Elf_Sym *sym = elf->symtab_start; sym < elf->symtab_stop; sym++) {
> +               if (is_sym_searchable(elf, sym)) {
> +                       if (table_size-- == 0)
> +                               fatal("%s: size mismatch\n", __func__);
> +                       table->symbol_index = sym - elf->symtab_start;
> +                       table->section_index = get_secindex(elf, sym);
> +                       table->addr = sym->st_value;
> +
> +                       /*
> +                        * For ARM Thumb instruction, the bit 0 of st_value is
> +                        * set if the symbol is STT_FUNC type. Mask it to get
> +                        * the address.
> +                        */
> +                       if (is_arm && ELF_ST_TYPE(sym->st_info) == STT_FUNC)
> +                               table->addr &= ~1;
> +
> +                       table++;
> +               }
> +       }
> +
> +       /* End sentinel; all values are unsigned so -1 wraps to max */
> +       if (table_size != 1)
> +               fatal("%s: size mismatch\n", __func__);
> +       table->symbol_index = -1;
> +       table->section_index = -1;
> +       table->addr = -1;
> +}
> +
> +void symsearch_init(struct elf_info *elf)
> +{
> +       /* +2 here to allocate space for the start and end sentinels */
> +       size_t table_size = symbol_count(elf) + 2;
> +
> +       elf->symsearch = NOFAIL(malloc(
> +                                       sizeof(struct symsearch) +
> +                                       sizeof(struct syminfo) * table_size));
> +       elf->symsearch->table_size = table_size;
> +
> +       symsearch_populate(elf, elf->symsearch->table, table_size);
> +       qsort(elf->symsearch->table, table_size,
> +             sizeof(struct syminfo), syminfo_compare);
> +
> +       /* A bit of paranoia; make sure that the end sentinel's address is
> +        * different than its predecessor.  Not doing this could cause
> +        * possible undefined behavior if anybody ever inserts a symbol
> +        * with section_index and addr both at their max values.

I do not understand this comment.

If section_index and addr are both at their max values at [table_size - 2],
->table[table_size - 2].addr + 1 wraps to zero.

Then the table is no longer sorted, is it?
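
To illustrate the wrap-around concern, here is a minimal standalone sketch.
It is not code from the patch: demo_addr_t is a hypothetical stand-in for
Elf_Addr (whose real width depends on the target), and both function names
are made up for this example.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical stand-in for Elf_Addr; assumed to be an unsigned type. */
typedef unsigned long long demo_addr_t;

/* The value the end sentinel's addr would get: predecessor's addr + 1. */
demo_addr_t demo_sentinel_addr(demo_addr_t predecessor_addr)
{
        /* Unsigned arithmetic is modular, so max + 1 wraps to 0. */
        return predecessor_addr + 1;
}

/* True if the predecessor still sorts strictly before the end sentinel. */
bool demo_still_sorted(demo_addr_t predecessor_addr)
{
        return predecessor_addr < demo_sentinel_addr(predecessor_addr);
}
```

With a predecessor address at the maximum value, the sentinel's addr becomes
zero and the ascending order of the last two entries is broken.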




> +        * Doing this little bit of defensive programming is more efficient
> +        * than checking for array overruns later.
> +        */
> +       elf->symsearch->table[table_size - 1].addr =
> +               elf->symsearch->table[table_size - 2].addr + 1;
> +}
> +
> +void symsearch_finish(struct elf_info *elf)
> +{
> +       free(elf->symsearch);
> +       elf->symsearch = NULL;
> +}
> +
> +/* Find the syminfo which is in secndx and "nearest" to addr.
> + * allow_negative: allow returning a symbol whose address is > addr.
> + * min_distance: ignore symbols which are further away than this.
> + *
> + * Returns a nonzero index into the symsearch table for success.
> + * Returns NULL if no legal symbol is found within the requested range.
> + */
> +static size_t symsearch_find_impl(struct elf_info *elf, Elf_Addr addr,
> +                                 unsigned int secndx, bool allow_negative,
> +                                 Elf_Addr min_distance)
> +{
> +       /* Find the target in the array; it will lie between two elements.
> +        * Invariant here: table[lo] < target <= table[hi]
> +        * For the purposes of search, exact hits in the search array are
> +        * considered greater than the target.  This means that if we do
> +        * get an exact hit, then once the search terminates, table[hi]
> +        * will be the exact match which has the lowest symbol index.
> +        */
> +       struct syminfo *table = elf->symsearch->table;
> +       size_t hi = elf->symsearch->table_size - 1;
> +       size_t lo = 0;




The binary search code is more complex than it needs to be
because it preserves the previous quirks.


I want to use the same comparison function for
qsort() and bsearch() so that the paranoid checks become unnecessary.
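
As a generic illustration of sharing one comparator between sorting and
searching, here is a minimal sketch using the standard qsort() and bsearch().
struct demo_key and the demo_* functions are hypothetical and simplified
(two fields instead of the three in struct syminfo).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified, hypothetical key; not the patch's struct syminfo. */
struct demo_key {
        unsigned int section_index;
        unsigned long long addr;
};

/* One comparator, used both to sort the table and to search it. */
int demo_key_compare(const void *p1, const void *p2)
{
        const struct demo_key *k1 = p1;
        const struct demo_key *k2 = p2;

        if (k1->section_index != k2->section_index)
                return k1->section_index > k2->section_index ? 1 : -1;
        if (k1->addr != k2->addr)
                return k1->addr > k2->addr ? 1 : -1;
        return 0;
}

/* Sort the table in place, then look up an exact (section, addr) match. */
struct demo_key *demo_sort_and_find(struct demo_key *table, size_t n,
                                    unsigned int section_index,
                                    unsigned long long addr)
{
        struct demo_key target = {
                .section_index = section_index,
                .addr = addr,
        };

        assert(table != NULL || n == 0);
        qsort(table, n, sizeof(*table), demo_key_compare);
        return bsearch(&target, table, n, sizeof(*table), demo_key_compare);
}
```

Note that bsearch() only reports exact matches, so a "nearest symbol" lookup
still needs a hand-written loop. That loop can call the same comparator,
which keeps the sort order and the search order consistent by construction.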




How about this implementation?



static struct syminfo *symsearch_find_impl(struct elf_info *elf, Elf_Addr addr,
                                           unsigned int secndx,
                                           bool allow_negative,
                                           Elf_Addr min_distance)
{
        struct syminfo target = {
                .symbol_index = -1,
                .section_index = secndx,
                .addr = addr,
        };
        struct syminfo *table = elf->symsearch->table;
        unsigned int hi = elf->symsearch->table_size - 1;
        unsigned int lo = 0;
        struct syminfo *result = NULL;
        Elf_Addr distance;

        while (lo < hi) {
                unsigned int mid = (lo + hi + 1) / 2;

                if (syminfo_compare(&table[mid], &target) > 0)
                        hi = mid - 1;
                else
                        lo = mid;
        }

        /*
         * The target resides between lo and (lo + 1).
         * If allow_negative is true, check both of them.
         */

        if (allow_negative && lo + 1 < elf->symsearch->table_size &&
            table[lo + 1].section_index == secndx) {
                distance = table[lo + 1].addr - addr;
                if (distance <= min_distance) {
                        min_distance = distance;
                        result = &table[lo + 1];
                }
        }

        if (table[lo].section_index == secndx) {
                distance = addr - table[lo].addr;
                if (distance <= min_distance)
                        result = &table[lo];
        }

        return result;
}

Elf_Sym *symsearch_find_nearest(struct elf_info *elf, Elf_Addr addr,
                                unsigned int secndx, bool allow_negative,
                                Elf_Addr min_distance)
{
        struct syminfo *result;

        result = symsearch_find_impl(elf, addr, secndx,
                                     allow_negative, min_distance);
        if (!result)
                return NULL;

        return &elf->symtab_start[result->symbol_index];
}



This does not preserve the previous quirks.

If there are multiple entries with the same address,
it always returns the last element.

This code does not expect sentinels in the table.

I have not tested it thoroughly, but it seems to work for me.
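
To make the "returns the last element" behavior concrete, here is a reduced
sketch of the same search loop over a plain array of addresses. It is a
simplification and an assumption on my side: a single section, no
symbol_index, and a hypothetical function name demo_last_le(). The midpoint
is computed as lo + (hi - lo + 1) / 2, which is equivalent to
(lo + hi + 1) / 2 but cannot overflow.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the index of the last element that is <= target.
 * If every element is greater than target, 0 is returned, so the
 * caller still has to check table[result] against target.
 */
unsigned int demo_last_le(const unsigned long long *table, unsigned int n,
                          unsigned long long target)
{
        unsigned int lo = 0;
        unsigned int hi;

        assert(n > 0);
        hi = n - 1;

        while (lo < hi) {
                /* Round up so that "lo = mid" always makes progress. */
                unsigned int mid = lo + (hi - lo + 1) / 2;

                if (table[mid] > target)
                        hi = mid - 1;
                else
                        lo = mid;
        }

        return lo;
}
```

For a sorted table { 0x10, 0x20, 0x20, 0x20, 0x30 } and target 0x20, the loop
ends at index 3, the last of the duplicates. A target smaller than every
element ends at index 0, which is the case the caller must check for.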




Also, please call symsearch_find_nearest() directly
and remove symfind_nearest_sym().






--
Best Regards

Masahiro Yamada



