Re: [PATCH 3/3] btf_encoder: don't encode duplicate variables

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 12/02/2025 00:49, Stephen Brennan wrote:
> btf__dedup() does not deduplicate variables, nor the contents of the
> DATASEC. This is reasonable, because there could very well be multiple
> variables of the same name & type from different CUs, but with different
> tags or DATASEC offsets. So it is up to us to ensure we don't provide
> duplicate variables to libbpf.
> 
> Unfortunately, there are some cases where variables are declared
> "__weak" in the source code, resulting in the same variable, with the
> same name & type, at the same offset, being encoded twice. The DWARF
> information does not encode any difference between the "__weak" symbol
> and the overriding symbol. And even if it did, we couldn't filter out
> "__weak" symbols, because they aren't always overridden. So, we must
> deduplicate them.
> 
> We can't guarantee that the types are identical prior to deduplication,
> but we can at least verify the offsets & names are identical. If this
> happens, simply skip the duplicate copy (or copies) of the variable.
>

hi Stephen, quick question; could we do without patch 2, and if we get
an offset match, then just look up the names of the offset-matching
variables in the code directly to see if they refer to the same
variable? that way we would only incur costs for the rare cases where we
get an  offset match. You could use the names__match() function to help
with that I think too. Would that be simpler maybe?

> Signed-off-by: Stephen Brennan <stephen.s.brennan@xxxxxxxxxx>
> ---
>  btf_encoder.c | 29 +++++++++++++++++++++++++++++
>  1 file changed, 29 insertions(+)
> 
> diff --git a/btf_encoder.c b/btf_encoder.c
> index d989fbd..29b2054 100644
> --- a/btf_encoder.c
> +++ b/btf_encoder.c
> @@ -869,8 +869,36 @@ static int32_t btf_encoder__encode_stored_vars(struct btf_encoder *encoder,
>  					       struct saved_var *var, uint16_t nr)
>  {
>  	int i, id, annot_idx;
> +	int prev = 0;
>  
>  	for (i = 0; i < nr; i++) {
> +		if (i && var[i].offset == var[prev].offset) {
> +			/* The offset of this var is the same as the previous.
> +			 * This can happen when a variable is declared in one CU
> +			 * as "__weak", and then overridden in another CU. There
> +			 * is no indication in the DWARF which symbol is weak,
> +			 * so we need to deduplicate. If the duplicate is the
> +			 * same name, this is expected, and we can just drop the
> +			 * second instance. Otherwise, we need to raise an
> +			 * error.
> +			 *
> +			 * Ideally, we would like to verify that the types of
> +			 * the duplicated variables are the same type as well.
> +			 * However, there's no way to do this without
> +			 * deduplicating the BTF, but at that point, our type
> +			 * IDs won't be valid.
> +			 */
> +			if (strcmp(var[i].name, var[prev].name) == 0) {
> +				var[i].type = 0; /* Don't encode corresponding secinfo */
> +				continue;
> +			} else {
> +				fprintf(stderr,
> +					"error: encountered variable '%s' (type %x) at  offset 0x%x which duplicates variable '%s' at the same offset\n",
> +					var[i].name, var[i].type, var[i].offset, var[prev].name);
> +				return -1;
> +			}
> +		}
> +
>  		id = btf_encoder__add_var(encoder, var[i].type,
>  					  var[i].name, var[i].linkage);
>  		if (id < 0) {
> @@ -894,6 +922,7 @@ static int32_t btf_encoder__encode_stored_vars(struct btf_encoder *encoder,
>  		free(var->annots);
>  		var->annots = NULL;
>  
> +		prev = i;
>  		var[i].type = id;
>  	}
>  	return 0;





[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux