On Thu, Jan 20, 2022 at 08:24:31PM +0100, Luca Weiss wrote: > Hi, > > On Dienstag, 18. Jänner 2022 11:08:22 CET David Gibson wrote: > > On Wed, Jan 05, 2022 at 04:48:31PM -0600, Rob Herring wrote: > > > On Sat, Dec 25, 2021 at 7:26 AM Luca Weiss <luca@xxxxxxxxx> wrote: > > > > Add a new method for decoding a string list property, useful for e.g. > > > > the "reg-names" property. > > > > > > > > Also add a test for the new method. > > > > > > > > Signed-off-by: Luca Weiss <luca@xxxxxxxxx> > > > > --- > > > > > > > > pylibfdt/libfdt.i | 7 +++++++ > > > > tests/pylibfdt_tests.py | 8 ++++++++ > > > > 2 files changed, 15 insertions(+) > > > > > > > > diff --git a/pylibfdt/libfdt.i b/pylibfdt/libfdt.i > > > > index 9ccc57b..c81b504 100644 > > > > --- a/pylibfdt/libfdt.i > > > > +++ b/pylibfdt/libfdt.i > > > > > > > > @@ -724,6 +724,13 @@ class Property(bytearray): > > > > raise ValueError('Property contains embedded nul > > > > characters') > > > > > > > > return self[:-1].decode('utf-8') > > > > > > > > + def as_stringlist(self): > > > > + """Unicode is supported by decoding from UTF-8""" > > > > + if self[-1] != 0: > > > > + raise ValueError('Property lacks nul termination') > > > > + parts = self[:-1].split(b'\x00') > > > > + return list(map(lambda x: x.decode('utf-8'), parts)) > > > > > > Doesn't this result in multiple decode() calls when a single one would > > > work: > > > > > > return data[:-1].decode(encoding='ascii').split('\0') > > > > Uh.. I guess? I feel like the split-then-decode makes more logical > > sense, since it's splitting a bytestring, then decoding the pieces as > > utf-8 strings. That makes sense to me given that raw properties are > > bytestrings and can included multiple different datatypes and > > encodings in general. > > > > In this specific case, decode-then-split would be fine as well, since > > \u00000 works as a separator unambiguously, but it still seems > > conceptually muddier to me. > > The reason I made it this way was mostly because I didn't know you could have > null bytes present in str.decode, I just remember horrible UnicodeDecodeErrors > on invalid input from other projects. > If wanted I can make a patch changing to just one str.decode call as yes, it's > surely more efficient than doing multiple. But it's also not like there will be > 100 parts of this property that's being decoded in a performance critical > application so I think it's also okay like that. I agree, and I tend to prefer it the way it is, unless there's a compelling reason to change. -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson
Attachment:
signature.asc
Description: PGP signature