Re: [PATCH dwarves 3/4] btf_encoder: cache all ELF section info

Alan Maguire <alan.maguire@xxxxxxxxxx> · Sun, 15 Sep 2024 12:26:05 +0100

On 13/09/2024 18:05, Stephen Brennan wrote:
> Alan Maguire <alan.maguire@xxxxxxxxxx> writes:
>> On 12/09/2024 20:08, Stephen Brennan wrote:
>>> To handle outputting all variables generally, we'll need to store more
>>> section data. Create a table of ELF sections so we can refer to all the
>>> cached data, not just the percpu section.
>>>
>>> Signed-off-by: Stephen Brennan <stephen.s.brennan@xxxxxxxxxx>
>>> ---
>>>  btf_encoder.c | 50 ++++++++++++++++++++++++++++++++++++--------------
>>>  1 file changed, 36 insertions(+), 14 deletions(-)
>>>
>>> diff --git a/btf_encoder.c b/btf_encoder.c
>>> index 8a2d92e..e3384e5 100644
>>> --- a/btf_encoder.c
>>> +++ b/btf_encoder.c
>>> @@ -65,12 +65,20 @@ struct elf_function {
>>>  	struct btf_encoder_state state;
>>>  };
>>>  
>>> +#define MAX_ELF_SEC_CNT    128
>>> +
>>>  struct var_info {
>>>  	uint64_t    addr;
>>>  	const char *name;
>>>  	uint32_t    sz;
>>>  };
>>>  
>>> +struct elf_secinfo {
>>> +	uint64_t    addr;
>>> +	const char *name;
>>> +	uint64_t    sz;
>>> +};
>>> +
>>
>> One thing we've run into before is hardcoded limits get exceeded on
>> systems, and while it seems unlikely for 128 ELF sections, I think it'd
>> be better to make this dynamic just in case. Also for parallel mode we
>> have N encoders so it might same some memory (which we're pretty
>> profligate with in other areas). I'd suggest using reallocarray_grow()
>> as we do for functions. One thing we may want to tweak is that
>> reallocarray_grow() will grow by 1000 or 1.5 x the current size;
>> starting with 1000 ELF sections is a bit much.  We could perhaps
>> tweak reallocarray_grow() to have a min_growth parameter to control this
>> and set it to something smaller for ELF sections, what do you think?
> 
> I agree that the static allocation doesn't make a ton of sense. To be
> quite honest, I was lazy during the initial implementation and intended
> to come back around to make this dynamically allocated.
> 
> I think the answer is to simply use elf_getshdrnum and directly allocate
> the array to the size we need. I'll definitely do that in the next
> version.
> 

You're right; since we know in advance how many sections we need we can
just calloc() to get zeroed memory.

>>>  /*
>>>   * cu: cu being processed.
>>>   */
>>> @@ -95,13 +103,13 @@ struct btf_encoder {
>>>  			  is_rel,
>>>  			  gen_distilled_base;
>>>  	uint32_t	  array_index_id;
>>> +	struct elf_secinfo secinfo[MAX_ELF_SEC_CNT];
>>
>> ...so here we'd have a substructure like this
>>
>> 	struct {
>> 		int	allocated;
>> 		int	sec_cnt;
>> 		struct elf_secinfo *sections;
>> 	} elf_sections;
>>
>>> +	size_t             seccnt;
>>>  	struct {
>>>  		struct var_info *vars;
>>>  		int		var_cnt;
>>>  		int		allocated;
>>>  		uint32_t	shndx;
>>> -		uint64_t	base_addr;
>>> -		uint64_t	sec_sz;
>>>  	} percpu;
>>>  	struct {
>>>  		struct elf_function *entries;
>>> @@ -1849,7 +1857,7 @@ static int btf_encoder__collect_percpu_var(struct btf_encoder *encoder, GElf_Sym
>>>  	 * ET_EXEC file) we need to subtract the section address.
>>>  	 */
>>>  	if (!encoder->is_rel)
>>> -		addr -= encoder->percpu.base_addr;
>>> +		addr -= encoder->secinfo[encoder->percpu.shndx].addr;
>>>
>> We will need some of these mechanics later when we encode function
>> addresses; great to have this!
> 
> Ironically enough, this code (and especially, the is_rel field referred
> to here) gets deleted in patch 4. Maybe I should have left is_rel if you
> think we're going to need function addresses from the symbol table.
> 

No need to leave it in; if there's reason to delete it for your changes,
please go ahead. We can resurrect code for functions if needed, but no
need for you to worry about potential future changes like that!

> But patch 4 fully embraces simply using DWARF to get the variable
> addresses, and it stops using the ELF symbol table for variables. The
> DWARF addresses are always absolute. I couldn't really find a good
> reason for using the symbol table, except maybe as a decent way to
> filter out bogus variables that didn't actually become real data. For
> example, the x86_64 percpu region starts at address 0, and many bogus
> DWARF variables have address 0, so  making sure the variable matched the
> symbol at address 0 was a good way to filter bogus variables.
> 
> I didn't think it was a big deal that I removed this behavior, because
> with improved variable filtering, this code continues to produce nearly
> identical percpu DATASECs, with the only difference being that we now
> (correctly) include the fixed_percpu_data variable for x86_64, which was
> omitted before. So there's no regression in functionality.
> 

FWIW I think it's much better the way you're doing things; the situation
with functions is unfortunately messier, and we need to do comparisons
across function declarations to probe for inconsistencies.  I'm hoping
we can start to move in this direction with the function code too
though, especially if we add address info for functions too (which we're
hoping to).  Thanks!

Alan

> Stephen
> 
>>>  	if (encoder->percpu.var_cnt == encoder->percpu.allocated) {
>>>  		struct var_info *new;
>>> @@ -1923,6 +1931,7 @@ static int btf_encoder__encode_cu_variables(struct btf_encoder *encoder)
>>>  	uint32_t core_id;
>>>  	struct tag *pos;
>>>  	int err = -1;
>>> +	struct elf_secinfo *pcpu_scn = &encoder->secinfo[encoder->percpu.shndx];
>>>  
>>>  	if (encoder->percpu.shndx == 0 || !encoder->symtab)
>>>  		return 0;
>>> @@ -1954,9 +1963,9 @@ static int btf_encoder__encode_cu_variables(struct btf_encoder *encoder)
>>>  		 * always contains virtual symbol addresses, so subtract
>>>  		 * the section address unconditionally.
>>>  		 */
>>> -		if (addr < encoder->percpu.base_addr || addr >= encoder->percpu.base_addr + encoder->percpu.sec_sz)
>>> +		if (addr < pcpu_scn->addr || addr >= pcpu_scn->addr + pcpu_scn->sz)
>>>  			continue;
>>> -		addr -= encoder->percpu.base_addr;
>>> +		addr -= pcpu_scn->addr;
>>>  
>>>  		if (!btf_encoder__percpu_var_exists(encoder, addr, &size, &name))
>>>  			continue; /* not a per-CPU variable */
>>> @@ -2099,20 +2108,33 @@ struct btf_encoder *btf_encoder__new(struct cu *cu, const char *detached_filenam
>>>  			goto out;
>>>  		}
>>>  
>>> -		/* find percpu section's shndx */
>>> +		/* index the ELF sections for later lookup */
>>>  
>>>  		GElf_Shdr shdr;
>>> -		Elf_Scn *sec = elf_section_by_name(cu->elf, &shdr, PERCPU_SECTION, NULL);
>>> +		size_t shndx;
>>> +		if (elf_getshdrnum(cu->elf, &encoder->seccnt))
>>> +			goto out_delete;
>>> +		if (encoder->seccnt >= MAX_ELF_SEC_CNT) {
>>> +			fprintf(stderr, "%s: reached limit of ELF sections\n", __func__);
>>> +			goto out_delete;
>>> +		}
>>>  
>> as above we should just reallocarray_grow() more space.
>>
>>> -		if (!sec) {
>>> -			if (encoder->verbose)
>>> -				printf("%s: '%s' doesn't have '%s' section\n", __func__, cu->filename, PERCPU_SECTION);
>>> -		} else {
>>> -			encoder->percpu.shndx	  = elf_ndxscn(sec);
>>> -			encoder->percpu.base_addr = shdr.sh_addr;
>>> -			encoder->percpu.sec_sz	  = shdr.sh_size;
>>> +		for (shndx = 0; shndx < encoder->seccnt; shndx++) {
>>> +			const char *secname = NULL;
>>> +			Elf_Scn *sec = elf_section_by_idx(cu->elf, &shdr, shndx, &secname);
>>> +			if (!sec)
>>> +				goto out_delete;
>>> +			encoder->secinfo[shndx].addr = shdr.sh_addr;
>>> +			encoder->secinfo[shndx].sz = shdr.sh_size;
>>> +			encoder->secinfo[shndx].name = secname;
>>> +
>>> +			if (strcmp(secname, PERCPU_SECTION) == 0)
>>> +				encoder->percpu.shndx = shndx;
>>>  		}
>>>  
>>> +		if (!encoder->percpu.shndx && encoder->verbose)
>>> +			printf("%s: '%s' doesn't have '%s' section\n", __func__, cu->filename, PERCPU_SECTION);
>>> +
>>>  		if (btf_encoder__collect_symbols(encoder, !encoder->skip_encoding_vars))
>>>  			goto out_delete;
>>>