Re: Size growth?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]



On Thu, Oct 22, 2020 at 03:00:13PM +1100, David Gibson wrote:
> On Wed, Oct 21, 2020 at 06:49:14PM -0400, Tom Rini wrote:
> > On Tue, Oct 20, 2020 at 01:09:07PM +1100, David Gibson wrote:
> > > On Mon, Oct 19, 2020 at 08:37:00AM -0400, Tom Rini wrote:
> > > > On Mon, Oct 19, 2020 at 12:42:13PM +1100, David Gibson wrote:
> > > > > On Fri, Oct 16, 2020 at 03:46:41PM -0600, Simon Glass wrote:
> > > > > > Hi Tom,
> > > > > > 
> > > > > > On Fri, 16 Oct 2020 at 13:36, Tom Rini <trini@xxxxxxxxxxxx> wrote:
> > > > > > >
> > > > > > > Hey all,
> > > > > > >
> > > > > > > With my U-Boot hat on, I'm trying to move us up from 84e414b0b5bc in dtc
> > > > > > > to cbca977ea121.  I'm seeing some rather worrying size increases
> > > > > > > however.  For example, on an aarch64 board such as leez-rk3399 we have:
> > > > > > >      leez-rk3399    : all +2696 spl/u-boot-spl:all +1116 spl/u-boot-spl:text +1116 text +2696
> > > > > > >         u-boot: add: 0/0, grow: 42/0 bytes: 2696/0 (2696)
> > > > > > >           function                                   old     new   delta
> > > > > > >           do_fdt                                    3992    4308    +316
> > > > > > >           fdt_check_header                           276     492    +216
> > > > > > >           fdt_add_property_                          388     556    +168
> > > > > > >           fdt_ro_probe_                              128     252    +124
> > > > > > >           fdt_packblocks_                            176     296    +120
> > > > > > >           fdt_open_into                              392     512    +120
> > > > > > >           fdt_blocks_misordered_                      96     216    +120
> > > > > > >           fdt_get_mem_rsv                            108     220    +112
> > > > > > >           fdt_offset_ptr                             104     208    +104
> > > > > > >           fdt_get_string                             288     388    +100
> > > > > > >           fdt_splice_                                148     228     +80
> > > > > > >           fdt_valid                                  204     276     +72
> > > > > > >           fdt_getprop_by_offset                      184     256     +72
> > > > > > >           fdt_splice_mem_rsv_                         96     152     +56
> > > > > > >           fdt_num_mem_rsv                             60     116     +56
> > > > > > >           fdt_splice_struct_                          96     144     +48
> > > > > > >           fdt_shrink_to_minimum                      220     268     +48
> > > > > > >           fdt_rw_probe_                              108     156     +48
> > > > > > >           fdt_mem_rsv                                 60     108     +48
> > > > > > >           fdt_getprop_namelen                        100     148     +48
> > > > > > >           fdt_get_property_by_offset_                100     148     +48
> > > > > > >           fdt_get_name                               164     212     +48
> > > > > > >           efi_install_fdt                            964    1012     +48
> > > > > > >           boot_get_fdt                               888     936     +48
> > > > > > >           fdt_move                                    80     116     +36
> > > > > > >           reserve_fdt                                 72      96     +24
> > > > > > >           reloc_fdt                                   76     100     +24
> > > > > > >           image_setup_libfdt                         284     308     +24
> > > > > > >           fit_image_load                            1608    1632     +24
> > > > > > >           fit_image_get_data_and_size                172     196     +24
> > > > > > >           fit_get_end                                 16      40     +24
> > > > > > >           fdt_next_tag                               256     280     +24
> > > > > > >           fdt_header_size                             12      36     +24
> > > > > > >           fdt_get_property_namelen_                  212     236     +24
> > > > > > >           fdt_get_property_namelen                    44      68     +24
> > > > > > >           fdt_get_phandle                            120     144     +24
> > > > > > >           fdt_del_node                                96     120     +24
> > > > > > >           fdt_del_mem_rsv                            112     136     +24
> > > > > > >           fdt_add_subnode_namelen                    284     308     +24
> > > > > > >           fdt_add_mem_rsv                            124     148     +24
> > > > > > >           common_diskboot                            680     704     +24
> > > > > > >           fdt_next_subnode                            80      88      +8
> > > > > > >         spl-u-boot-spl: add: 1/-1, grow: 25/-2 bytes: 1176/-60 (1116)
> > > > > > >           function                                   old     new   delta
> > > > > > >           fdt_get_mem_rsv                             52     236    +184
> > > > > > >           fdt_add_property_                          340     484    +144
> > > > > > >           fdt_get_name                                68     152     +84
> > > > > > >           fdt_splice_                                148     228     +80
> > > > > > >           fdt_num_mem_rsv                             64     128     +64
> > > > > > >           fdt_splice_mem_rsv_                         96     152     +56
> > > > > > >           fdt_offset_ptr                              52     104     +52
> > > > > > >           fdt_splice_struct_                          96     144     +48
> > > > > > >           fdt_shrink_to_minimum                      220     268     +48
> > > > > > >           fdt_get_property_by_offset_                 36      84     +48
> > > > > > >           fdt_ro_probe_                                -      36     +36
> > > > > > >           fdt_subnode_offset_namelen                 200     232     +32
> > > > > > >           spl_load_simple_fit                        848     872     +24
> > > > > > >           spl_fit_append_fdt                         196     220     +24
> > > > > > >           fdt_getprop_by_offset                       80     104     +24
> > > > > > >           fdt_get_string                              64      88     +24
> > > > > > >           fdt_get_property_namelen_                  200     224     +24
> > > > > > >           fdt_get_phandle                            120     144     +24
> > > > > > >           fdt_del_mem_rsv                            104     128     +24
> > > > > > >           fdt_check_header                            28      52     +24
> > > > > > >           fdt_add_subnode_namelen                    260     284     +24
> > > > > > >           fdt_add_mem_rsv                            112     136     +24
> > > > > > >           fdt_next_tag                               192     212     +20
> > > > > > >           fdt_path_offset_namelen                    240     256     +16
> > > > > > >           fdt_supernode_atdepth_offset               160     172     +12
> > > > > > >           fdt_node_offset_by_phandle                 120     132     +12
> > > > > > >           fdt_mem_rsv                                 20       -     -20
> > > > > > >           fdt_check_prop_offset_                      64      44     -20
> > > > > > >           fdt_check_node_offset_                      64      44     -20
> > > > > > >
> > > > > > > And note that for the SPL case we're already setting ASSUME_MASK to
> > > > > > > 0xff so there's maximum savings already being done there.  Does anyone
> > > > > > > have ideas on where / how to further tweak code size?  Thanks!
> > > > > > 
> > > > > > +David Gibson
> > > > > > 
> > > > > > I suspect there are more checks that need to be made conditional.
> > > > > 
> > > > > Seems likely.
> > > > 
> > > > OK.  Does that mean you're going to take a look?
> > > 
> > > No, sorry, my time for dtc is very limited.
> > 
> > OK, fair point.
> > 
> > > > > Though, as I've opined before, from what I understand the SPL is *so*
> > > > > restricted an environment, I'm not really convinced DTs are the right
> > > > > tool for the job there.
> > > > 
> > > > SPL is space constrained, which is why we mask out all of the safety
> > > > checks.  But we still generally have enough memory that this is fine.
> > > > This specific board has 256KiB for SPL, for example.  But I'm not just
> > > > concerned about 1KiB of growth when everything is masked off, I'm also
> > > > concerned about 2KiB of growth over a changelog that doesn't read like
> > > > it added a bunch of stuff that should cause everything to grow,
> > > > either.
> > > 
> > > Yeah, that's a good point.  It's certainly not obvious to me what
> > > would have caused the growth.  I think we'll need a revision by
> > > revision test to see where it was added.
> > 
> > So, with a bit of playing around, I've used buildman to give us that,
> > for the leez-rk3399 platform.  The full results are at
> > https://gist.github.com/trini/e42cc848d38bfc77e0124d3ace86d95d and note
> > that build #6, "scripts/dtc: Update to upstream version
> > v1.4.6-22-g24b1f3f064d4" is the first interesting one (#5 was a resync
> > entirely to dtc's 84e414b0b5bc rather than strictly what U-Boot has).
> 
> Right.. I was meaning stepping through the dtc upstream revisions, not
> just the uboot revisions, so you get a fine grained idea of where the
> increase is coming from.

Yes, that's what that link is.  I imported every revision from point A
to point B, and built U-Boot for it, and checked the sizes, since we
have tooling for that.  It's possible the bloat-o-meter script in the
kernel could be used, but I'm not sure what the right target(s) in dtc
would be, to get any useful information out.

> > But what does all of this _mean_ ?  I kinda think I have an answer now.
> > One of the things that sticks out is 6dcb8ba408ec adds a lot and
> > 11738cf01f15 reduces it just a little.
> 
> Ah, that's a tricky one.  If we don't handle unaligned accesses we
> instead get intermittent bug reports where it just crashes.

We really need to talk about that then.  There was a problem of people
turning off the sanity check for making sure the entire device tree was
aligned and then having everything crash.

> I suppose we could add an ASSUME_ALIGNED_ACCESS flag, and it will just
> break for either an unaligned dtb (unlikely) or if you attempt to load
> an unaligned value from a property (more likely, but don't add the
> flag if you're not sure you don't need it).

So long as it's abstracted in such a way that we don't grow the size of
everything again, yes, that is the right way forward I think.  These
tests still introduce a big boot-time slowdown as well.

> > With a re-revert again locally, there's just a few size
> > growths now to look in to on the SPL side:
> >      leez-rk3399    : all +8 spl/u-boot-spl:all +180 spl/u-boot-spl:text +180 text +8
> >         u-boot: add: 0/0, grow: 3/-1 bytes: 28/-20 (8)
> >           function                                   old     new   delta
> >           fdt_move                                    80      92     +12
> >           fdt_splice_                                148     156      +8
> >           fdt_offset_ptr                             104     112      +8
> >           fdt_get_string                             288     268     -20
> >         spl-u-boot-spl: add: 1/0, grow: 9/-2 bytes: 220/-40 (180)
> >           function                                   old     new   delta
> >           fdt_get_name                                68     128     +60
> >           fdt_get_mem_rsv                             52      96     +44
> >           fdt_subnode_offset_namelen                 200     232     +32
> >           fdt_next_tag                               192     212     +20
> >           fdt_path_offset_namelen                    240     256     +16
> >           fdt_supernode_atdepth_offset               160     172     +12
> >           fdt_ro_probe_                                -      12     +12
> >           fdt_node_offset_by_phandle                 120     132     +12
> >           fdt_splice_                                148     156      +8
> >           fdt_offset_ptr                              52      56      +4
> >           fdt_check_prop_offset_                      64      44     -20
> >           fdt_check_node_offset_                      64      44     -20
> > 
> > Of those, I'm not 100% sure.  That might well end up being Simon's
> > suggestion of a few places needing a check they don't have, I'm not
> > sure.
> 
> Maybe.  Really need to know which dtc/libfdt revision makes the
> changes to decipher that.

Yes, there's a few suggestive dtc revisions that change the size of
those functions in the full gist.  I think some of the satisfy Coverity
changes had small impact, but I didn't look for fdt_get_name /
fdt_get_mem_rsv.

-- 
Tom

Attachment: signature.asc
Description: PGP signature


[Index of Archives]     [Device Tree]     [Device Tree Spec]     [Linux Driver Backports]     [Video for Linux]     [Linux USB Devel]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [Yosemite Backpacking]

  Powered by Linux