Re: [PATCH dwarves 4/5] btf_encoder: represent "."-suffixed optimized functions (".isra.0") in BTF

On 25/01/2023 17:54, Kui-Feng Lee wrote:
> 
> On 1/24/23 05:45, Alan Maguire wrote:
>> +/*
>> + * static functions with suffixes are not added yet - we need to
>> + * observe across all CUs to see if the static function has
>> + * optimized parameters in any CU, since in such a case it should
>> + * not be included in the final BTF.  NF_HOOK.constprop.0() is
>> + * a case in point - it has optimized-out parameters in some CUs
>> + * but not others.  In order to have consistency (since we do not
>> + * know which instance the BTF-specified function signature will
>> + * apply to), we simply skip adding functions which have optimized
>> + * out parameters anywhere.
>> + */
>> +static int32_t btf_encoder__save_func(struct btf_encoder *encoder, struct function *fn)
>> +{
>> +    struct btf_encoder *parent = encoder->parent ? encoder->parent : encoder;
>> +    const char *name = function__name(fn);
>> +    struct function **nodep;
>> +    int ret = 0;
>> +
>> +    pthread_mutex_lock(&parent->saved_func_lock);
> 
> Do you have the number of static functions with suffixes?
> 

There are a few thousand "."-suffixed functions, and around 25000 static
functions overall (all "."-suffixed functions are static) that will
participate in the tree representations (see patch 5).  This equates to
roughly half of the vmlinux BTF functions.

> If the number of static functions with suffixes is high, the contention of the lock would be an issue.
> 
> Is it possible to keep a local pool of static functions with suffixes? The pool will be combined with its parent either at the completion of a CU, before ending the thread or when merging into the main thread.
>

It's possible alright. I'll try to lay out the possibilities so we
can figure out the best way forward.

Option 1: global tree of static functions, created during DWARF loading

Pro: quick addition/lookup; we can flag optimizations or inconsistent
prototypes as we encounter them.
Con: lock contention between encoder threads.

Option 2: store static functions in a per-encoder tree, traverse them all
prior to BTF merging to eliminate unwanted functions

Pro: limits contention.
Con: for each static function in each encoder, we need to look it up in all
other encoder trees. In option 1 we pay that price as the function is added;
here we pay it later, prior to merging. So processing here is
O(number_functions * num_encoders). There may be a cleverer way to handle
this but I can't see it right now.
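
To make option 2 a bit more concrete, here is a rough, untested sketch of
per-encoder trees that are folded into one combined tree once the encoder
threads are done. It is not taken from the patch series: struct saved_func,
saved_func__compare(), local_tree__save() and global_tree__merge() are
hypothetical names, and struct saved_func is a stand-in for the parts of
struct function that matter here. It only assumes POSIX tsearch()/twalk().

```c
#include <assert.h>
#include <search.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the relevant fields of struct function. */
struct saved_func {
	const char *name;
	bool optimized_parms;
};

static int saved_func__compare(const void *a, const void *b)
{
	const struct saved_func *fa = a, *fb = b;

	return strcmp(fa->name, fb->name);
}

/*
 * Add fn to a tree, or OR its "optimized-out parameters" flag into the
 * entry that is already there.  When called on a per-encoder tree from
 * that encoder's own thread, no lock is taken.
 */
static int local_tree__save(void **tree, struct saved_func *fn)
{
	struct saved_func **nodep = tsearch(fn, tree, saved_func__compare);

	if (nodep == NULL)
		return -1;	/* out of memory */
	if (*nodep != fn)
		(*nodep)->optimized_parms |= fn->optimized_parms;
	return 0;
}

/*
 * twalk() has no user-data argument, so the destination is kept in
 * file-scope variables.  That is only safe because this sketch assumes
 * the merge runs in a single thread after the encoders have finished.
 */
static void **merge_dst;
static int merge_err;

static void merge_one(const void *nodep, VISIT which, int depth)
{
	struct saved_func *fn = *(struct saved_func * const *)nodep;

	(void)depth;
	/* postorder/leaf together visit every node exactly once */
	if (which != postorder && which != leaf)
		return;
	if (local_tree__save(merge_dst, fn) < 0)
		merge_err = -1;
}

/* Fold one per-encoder tree into the combined tree. */
static int global_tree__merge(void **global_tree, void *local_tree)
{
	assert(global_tree != NULL);
	merge_dst = global_tree;
	merge_err = 0;
	twalk(local_tree, merge_one);
	return merge_err;
}
```

In this sketch each entry of a per-encoder tree is inserted into the
combined tree once, and a function ends up flagged if any encoder saw
optimized-out parameters for it. Whether that fits the real encoder flow
(type_id_off handling, when the threads finish) is not something the
sketch tries to answer.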

There may be other approaches to this of course, but these were the two I
could come up with. What do you think?

Alan

> 
>> +    nodep = tsearch(fn, &parent->saved_func_tree, function__compare);
>> +    if (nodep == NULL) {
>> +        fprintf(stderr, "error: out of memory adding local function '%s'\n",
>> +            name);
>> +        ret = -1;
>> +        goto out;
>> +    }
>> +    /* If we find an existing entry, we want to merge observations
>> +     * across both functions, checking that the "seen optimized-out
>> +     * parameters" status is reflected in our tree entry.
>> +     * If the entry is new, record encoder state required
>> +     * to add the local function later (encoder + type_id_off)
>> +     * such that we can add the function later.
>> +     */
>> +    if (*nodep != fn) {
>> +        (*nodep)->proto.optimized_parms |= fn->proto.optimized_parms;
>> +    } else {
>> +        struct btf_encoder_state *state = zalloc(sizeof(*state));
>> +
>> +        if (state == NULL) {
>> +            fprintf(stderr, "error: out of memory adding local function '%s'\n",
>> +                name);
>> +            ret = -1;
>> +            goto out;
>> +        }
>> +        state->encoder = encoder;
>> +        state->type_id_off = encoder->type_id_off;
>> +        fn->priv = state;
>> +        encoder->saved_func_cnt++;
>> +        if (encoder->verbose)
>> +            printf("added local function '%s'%s\n", name,
>> +                   fn->proto.optimized_parms ?
>> +                   ", optimized-out params" : "");
>> +    }
>> +out:
>> +    pthread_mutex_unlock(&parent->saved_func_lock);
>> +    return ret;
>> +}
>> +


