On Wed, Jan 05, 2022 at 04:48:31PM -0600, Rob Herring wrote:
> On Sat, Dec 25, 2021 at 7:26 AM Luca Weiss <luca@xxxxxxxxx> wrote:
> >
> > Add a new method for decoding a string list property, useful for e.g.
> > the "reg-names" property.
> >
> > Also add a test for the new method.
> >
> > Signed-off-by: Luca Weiss <luca@xxxxxxxxx>
> > ---
> >  pylibfdt/libfdt.i       | 7 +++++++
> >  tests/pylibfdt_tests.py | 8 ++++++++
> >  2 files changed, 15 insertions(+)
> >
> > diff --git a/pylibfdt/libfdt.i b/pylibfdt/libfdt.i
> > index 9ccc57b..c81b504 100644
> > --- a/pylibfdt/libfdt.i
> > +++ b/pylibfdt/libfdt.i
> > @@ -724,6 +724,13 @@ class Property(bytearray):
> >              raise ValueError('Property contains embedded nul characters')
> >          return self[:-1].decode('utf-8')
> >
> > +    def as_stringlist(self):
> > +        """Unicode is supported by decoding from UTF-8"""
> > +        if self[-1] != 0:
> > +            raise ValueError('Property lacks nul termination')
> > +        parts = self[:-1].split(b'\x00')
> > +        return list(map(lambda x: x.decode('utf-8'), parts))
>
> Doesn't this result in multiple decode() calls when a single one would work:
>
> return data[:-1].decode(encoding='ascii').split('\0')

Uh.. I guess?  I feel like the split-then-decode makes more logical
sense, since it's splitting a bytestring, then decoding the pieces as
utf-8 strings.  That makes sense to me given that raw properties are
bytestrings and can include multiple different datatypes and
encodings in general.

In this specific case, decode-then-split would be fine as well, since
\u0000 works as a separator unambiguously, but it still seems
conceptually muddier to me.

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson
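
For reference, here is a rough standalone sketch of the two orderings
discussed above, operating on plain bytes rather than on the pylibfdt
Property class.  The function names and the sample data are made up
for illustration and are not part of the patch.  It relies on the fact
that in UTF-8 a 0x00 byte only ever encodes U+0000 and never appears
inside a multi-byte sequence, so both orderings should give the same
list.

```python
def split_then_decode(raw):
    """Split the bytestring on nul bytes, then decode each piece as UTF-8."""
    if not raw or raw[-1] != 0:
        raise ValueError('Property lacks nul termination')
    return [part.decode('utf-8') for part in raw[:-1].split(b'\x00')]


def decode_then_split(raw):
    """Decode the whole bytestring as UTF-8, then split on U+0000."""
    if not raw or raw[-1] != 0:
        raise ValueError('Property lacks nul termination')
    return raw[:-1].decode('utf-8').split('\x00')


# Hypothetical stringlist value: two nul-terminated strings, one non-ASCII.
sample = 'ctrl\x00d\u00e9j\u00e0\x00'.encode('utf-8')

# Both orderings agree for UTF-8 input.
same = split_then_decode(sample) == decode_then_split(sample)
```

Note that the 'ascii' codec in the quoted one-liner would raise
UnicodeDecodeError on the second string here, which is a separate
matter from the ordering question.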