On Tue, Jan 28, 2020 at 09:51:43AM -0500, Tom Rini wrote: > On Tue, Jan 28, 2020 at 08:08:53AM -0600, Rob Herring wrote: > > On Tue, Jan 28, 2020 at 7:43 AM Tom Rini <trini@xxxxxxxxxxxx> wrote: > > > > > > On Tue, Jan 28, 2020 at 07:59:18PM +1100, David Gibson wrote: > > > > On Mon, Jan 27, 2020 at 10:04:34AM -0500, Tom Rini wrote: > > > > > On Mon, Jan 27, 2020 at 02:23:51PM +1100, David Gibson wrote: > > > > > > On Thu, Jan 23, 2020 at 07:16:50AM -0500, Tom Rini wrote: > > > > > > > On Thu, Jan 23, 2020 at 08:14:40PM +1100, David Gibson wrote: > > > > > > > > On Fri, Jan 17, 2020 at 10:31:06AM -0500, Tom Rini wrote: > > > > > > > > > 6dcb8ba4 "libfdt: Add helpers for accessing unaligned words" introduced > > > > > > > > > changes to support unaligned reads for ARM platforms and 11738cf01f15 > > > > > > > > > "libfdt: Don't use memcpy to handle unaligned reads on ARM" improved the > > > > > > > > > performance of these helpers. > > > > > > > > > > > > > > > > > > Ultimately however, these helpers should not exist. Unaligned access > > > > > > > > > only occurs when images are forced out of alignment by the user. This > > > > > > > > > unalignment is not supported and introduces problems later on as other > > > > > > > > > parts of the system image are unaligned and they too require alignment. > > > > > > > > > > > > > > > > > > Revert both of these changes. > > > > > > > > > > > > > > > > > > Signed-off-by: Tom Rini <trini@xxxxxxxxxxxx> > > > > > > > > > --- > > > > > > > > > By way of a little more explanation, looking at the archives it seems > > > > > > > > > that the initial bug reporter said that they had a platform that was > > > > > > > > > using U-Boot and had the "fdt_high=0xffffffff" set in the environment. > > > > > > > > > What that does is to tell U-Boot to not do any of the sanity checks and > > > > > > > > > relocation to ensure alignment that it would normally do because the > > > > > > > > > user knows best. This later came up on the U-Boot list as once the DTB > > > > > > > > > was loaded, Linux is unhappy because it demands correct alignment. > > > > > > > > > > > > > > > > > > I only realized libfdt had introduced changes here when it was reported > > > > > > > > > that boot time had gotten much slower once we merged this change in. It > > > > > > > > > would be best to just drop it. > > > > > > > > > > > > > > > > Hmm. I'm not sure about this. The commit message makes a case for > > > > > > > > why the unaligned handling isn't necessary, but not a case for why > > > > > > > > it's bad. Even if handling an unaligned tree isn't a common case, > > > > > > > > isn't it better to be able to than not? > > > > > > > > > > > > > > > > I gather from the previous discussion that there's a significant > > > > > > > > performance impact, but that rationale needs to go into the commit > > > > > > > > message for posterity. > > > > > > > > > > > > > > I wanted to emphasize that the code simply isn't ever needed, not that > > > > > > > it's a performance problem. A performance problem implies that we would > > > > > > > keep this, if it was fast enough. That's why people noticed it (it > > > > > > > slows things down to an unusable level). But it's functionally wrong. > > > > > > > > > > > > > > > [snip] > > > > > > > > > diff --git a/libfdt/libfdt.h b/libfdt/libfdt.h > > > > > > > > > index fc4c4962a01c..d4ebe915cf46 100644 > > > > > > > > > --- a/libfdt/libfdt.h > > > > > > > > > +++ b/libfdt/libfdt.h > > > > > > > > > @@ -117,23 +117,6 @@ static inline void *fdt_offset_ptr_w(void *fdt, int offset, int checklen) > > > > > > > > > > > > > > > > > > uint32_t fdt_next_tag(const void *fdt, int offset, int *nextoffset); > > > > > > > > > > > > > > > > > > -/* > > > > > > > > > - * Alignment helpers: > > > > > > > > > - * These helpers access words from a device tree blob. They're > > > > > > > > > - * built to work even with unaligned pointers on platforms (ike > > > > > > > > > - * ARM) that don't like unaligned loads and stores > > > > > > > > > - */ > > > > > > > > > - > > > > > > > > > -static inline uint32_t fdt32_ld(const fdt32_t *p) > > > > > > > > > -{ > > > > > > > > > - const uint8_t *bp = (const uint8_t *)p; > > > > > > > > > - > > > > > > > > > - return ((uint32_t)bp[0] << 24) > > > > > > > > > - | ((uint32_t)bp[1] << 16) > > > > > > > > > - | ((uint32_t)bp[2] << 8) > > > > > > > > > - | bp[3]; > > > > > > > > > -} > > > > > > > > > > > > > > > > In particular, I definitely think removing the helpers entirely is a > > > > > > > > no go. They're now part of the published interface of the library. > > > > > > > > > > > > > > Perhaps "mistakes were made" ? I don't think we need to worry about > > > > > > > removing an interface here as projects are avoiding upgrading libfdt > > > > > > > now (TF-A, formerly ATF) or reverting the change (U-Boot). > > > > > > > > > > > > > > > Even if they're not used for reading the internal tags, they can be > > > > > > > > used to load integers from within particular properties. Those are > > > > > > > > frequently unaligned, since properties generally have packed > > > > > > > > representations. > > > > > > > > > > > > > > I don't see the relevance. Go back to the initial problem report. It's > > > > > > > not "I have a new unique platform I'm using libfdt on and I have > > > > > > > problems". It's "I keep jabbing myself with a rusty nail and now I have > > > > > > > problems". > > > > > > > > > > > > The initial report isn't the only relevant thing here. Although it > > > > > > prompted the change, it wasn't the only consideration in making it. > > > > > > > > > > > > There's also two separate questions here: > > > > > > 1) Do we want byteswapping integer load helpers? > > > > > > 2) Should those handle unaligned accesses? > > > > > > > > > > > > In working out how to address the (as it turns out, non existent) > > > > > > problem, I realized an abstraction for loading big-endian integers > > > > > > from the blob would be a useful thing in its own right. I also > > > > > > realized that it's a useful thing not just inside the libfdt code, but > > > > > > as an external interface. Callers have always needed to interpret the > > > > > > contents of individual dt properties, and loading integers from them > > > > > > is often a part of that. > > > > > > > > > > > > I know of people using those fdtXX_ld() helpers right now, so I'm not > > > > > > going to take them away. > > > > > > > > > > > > For the case of external users we absolutely do need to handle > > > > > > unaligned accesses. There are a number of existing bindings that mix > > > > > > strings and integers in packed format, in a single encoded property. > > > > > > So regardless of the alignment of the whole property, we can easily > > > > > > get unaligned integers in there, and I don't want to expose multiple > > > > > > different helpers for different cases. > > > > > > > > > > > > Now, we don't *have* to use the same helpers for internal use. We > > > > > > could open code the internal loads, or use a special aligned-only > > > > > > version inside. But using the existing external helpers is the > > > > > > obvious and simple choice. > > > > > > > > > > > > So, we're back to: I need a case for changing this now, not just a > > > > > > case for claiming it wasn't needed in the first place. > > > > > > > > > > For U-Boot, I'm just going to revert this part of the changes. I would > > > > > > > > That seems reasonable for the interim. > > > > > > > > > suggest that you look in to some way to fix the "fast path" here to go > > > > > back to the previous way of working so that other project can continue > > > > > to use libfdt as well and when callers need to access these helpers and > > > > > are not otherwise in the fast path can do so. > > > > > > > > > > You're adding visible boot time delay with things the way they exist > > > > > today. That's not OK. > > > > > > > > That's a fair point, but you need to make that case in the commit > > > > message and merge request, not just expect me to find and collate the > > > > arguments from other threads. > > > > > > If you want me to leave the helpers alone but otherwise revert things, I > > > can do a v2 and expand the commit message. And perhaps I'm too nice > > > sometimes then but I do pickup and tweak things that are close enough to > > > what I want and reword as needed for clarity. > > > > Why not just fix the helpers for the aligned case and be done with it: > > > > static inline uint32_t fdt32_ld(const fdt32_t *p) > > { > > const uint8_t *bp = (const uint8_t *)p; > > > > + if (!((unsigned long)p & 0x3)) > > + return fdt32_to_cpu(*p); > > + > > return ((uint32_t)bp[0] << 24) > > | ((uint32_t)bp[1] << 16) > > | ((uint32_t)bp[2] << 8) > > | bp[3]; > > } > > I don't know if that was one of the ideas David tried when the problem > was reported initially. It would be good to know what the > size/performance impact of that is. But still, fast path needs to be > fast. I haven't tried this - I don't have easy access to systems to measure the performance impact. Or, rather, I can get access to systems, but I don't really have the bandwidth to prepare a performance testing setup. I think this is worth a shot. I don't know much the conditional branch will impact the fast path. If it's significantly better than what we have now, it might be a good interim step, even if it's not the final word. > > > I still believe you have things wrong. There's not an unaligned access > > > problem that libfdt needs to care about. ARM doesn't need help handling > > > unaligned accesses. The only problem that's been reported is from when > > > a user got themselves so far off in the weeds that nothing else matters. > > > > I think while the vast majority of DTBs don't have anything that would > > cause unaligned accesses, that's not guaranteed by the FDT format. > > libfdt needs to handle the worst case. > > It also needs to be useful. Please note that two of the projects using > libfdt are holding back upgrading (TF-A, > https://git.trustedfirmware.org/TF-A/trusted-firmware-a.git/log/lib/libfdt/fdt.c) > or reverting this change (U-Boot) because this is increasing boot time > in visible ways. It's not "you added 100ms". It's "you added 1 second > or more". > > > What about ARMv5 and v4 which don't universally support unaligned > > accesses or any other architecture. Do all mips, openrisc, riscv, arc, > > microblaze, xtenza, etc. support unaligned accesses? > > It's been long enough since the last time I was in a what-about > discussion over alignment issues that no, I don't recall just how much > special casing you need to put in the common paths to handle the > uncommon case. My general recollection is that no, we don't need to go > all out on this as the cases where unaligned access is fatal (rather > than just bad performance in that rare case) is small. It's so small > that the problem wasn't found here because someone did that, it's > because someone (inadvertently) turned off all the safety and sanity > checks and then saw not safe and not sane results. > > Thanks! > -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson
Attachment:
signature.asc
Description: PGP signature