On Wed, Jun 10, 2020 at 09:20:29AM +0800, Shi Lei wrote: > This tool is used to generate parsexml/formatbuf functions for structs. > It is based on libclang and its python-binding. > Some directives (such as genparse, xmlattr, etc.) need to be added on > the declarations of structs to direct the tool. > > Signed-off-by: Shi Lei <shi_lei@xxxxxxxxxxxxxx> > --- > build-aux/generator/directive.py | 839 +++++++++++++++++++++++++++++++ > build-aux/generator/go | 14 + > build-aux/generator/main.py | 416 +++++++++++++++ > build-aux/generator/utils.py | 100 ++++ FWIW, we've tried to standardize on scripts/ directory for any helper scripts used during the build process. Calling it "generator" is probably too broad, as we've lots of diffferent generators. I think this would would be a good fit as scripts/xmlgen/ > po/POTFILES.in | 1 + > 5 files changed, 1370 insertions(+) > create mode 100644 build-aux/generator/directive.py > create mode 100755 build-aux/generator/go > create mode 100755 build-aux/generator/main.py > create mode 100644 build-aux/generator/utils.py > +T_CLEAR_FUNC_IMPL = ''' > +void > +${typename}Clear(${typename}Ptr def) > +{ > + if (!def) > + return; > + > + ${body} > +} > +''' > + > +T_CLEAR_FUNC_DECL = ''' > +void > +${typename}Clear(${typename}Ptr def); In general I think we have a preference for ${typename}Free functions, rather than ${typename}Clear. A Clear() func is needed when a struct is often stack allocated, or when the method allocating & free'ing the struct is different from the method clearing it. The latter case is generally considered bad practice - it is safer and easier to verify code when clearing and freeing is done at the same same. > +def clearMember(member): > + mtype = TypeTable().get(member['type']) > + > + refname = 'def->%s' % member['name'] > + if member.get('array'): > + refname += '[i]' > + > + code = '' > + if mtype['meta'] == 'Struct': > + if member['pointer'] and not mtype.get('external'): > + code = '%sClear(%s);' % (mtype['name'], refname) > + code += '\nVIR_FREE(%s);' % refname We should use g_free() for any new code. > + else: > + code = '%sClear(&%s);' % (mtype['name'], refname) > + elif mtype['name'] == 'String': > + code = 'VIR_FREE(%s);' % refname > + elif mtype['name'] in ['Chars', 'UChars']: > + code = 'memset(%s, 0, sizeof(%s));' % (refname, refname) > + elif not member.get('array'): > + code = '%s = 0;' % refname > + > + if member.get('specified'): > + assert not member.get('array'), "'specified' can't come with 'array'." > + code += '\n%s_specified = false;' % refname > + > + if member.get('array') and code: > + counter = counterName(member['name']) > + if singleline(code): > + code = render(T_LOOP_SINGLE, counter=counter, body=code) > + else: > + code = render(T_LOOP_MULTI, > + counter=counter, body=indent(code, 2)) > + code += '\nVIR_FREE(def->%s);' % member['name'] > + code += '\ndef->%s = 0;' % counter > + > + return code > + > + > +T_CLEAR_NAMESPACE = ''' > +if (def->namespaceData && def->ns.free) > + (def->ns.free)(def->namespaceData); > +''' > + > + > +def makeClearFunc(writer, atype): > + if 'genparse' not in atype: > + return > + > + blocks = BlockAssembler() > + for member in atype['members']: > + blocks.append(clearMember(member)) > + > + if 'namespace' in atype: > + blocks.append(T_CLEAR_NAMESPACE.strip()) > + > + body = indent(blocks.output('\n\n'), 1) > + > + impl = render(T_CLEAR_FUNC_IMPL, typename=atype['name'], body=body) > + writer.write(atype, 'clearfunc', '.c', impl) > + > + decl = render(T_CLEAR_FUNC_DECL, typename=atype['name']) > + writer.write(atype, 'clearfunc', '.h', decl) Bearing in mind the above, I'd say we should create Free() funcs, not Clear() funcs here. > + > + > +# > +# Templates for parsing member block > +# > +T_READ_ATTR_BY_PROP = '${mdvar} = virXMLPropString(node, "${tname}");' > +T_READ_ELEM_BY_PROP = '${mdvar} = virXMLChildNode(node, "${tname}");' > +T_READ_ELEM_CONTENT = '${mdvar} = virXMLChildNodeContent(node, "${tname}");' > + > +T_PARSE_MEMBER_MORE = ''' > +${number} = virXMLChildNodeSet(node, "${tname}", &nodes); > +if (${number} > 0) { > + size_t i; > + > + if (VIR_ALLOC_N(def->${name}, ${number}) < 0) > + goto error; > + > + for (i = 0; i < ${number}; i++) { > + xmlNodePtr tnode = nodes[i]; > + ${item} > + } > + def->${counter} = ${number}; > + VIR_FREE(nodes); > +} else if (${number} < 0) { > + virReportError(VIR_ERR_XML_ERROR, "%s", > + _("Invalid ${tname} element found.")); For error messages we should avoid using python string substitution. Instead we want todo virReportError(VIR_ERR_XML_ERROR, _("Invalid %s element found."). "${tname}"); This means that when we repeat this code with many different values of ${tname}, we're only getting 1 string for the translation team to deal with. The same point for other virReportError error callsin this file, I won't mention them all individually. > +T_ALLOC_MEMORY = ''' > +if (VIR_ALLOC(def->${name}) < 0) > + goto error; > +''' We prefer g_new0 for new code now. As a general rule for any code, if there is a GLib API that works then we should use that, instead of a libvirt API. > +T_PARSE_FUNC_DECL = ''' > +int > +${funcname}(${args}); > +''' > + > +T_PARSE_FUNC_IMPL = ''' > +int > +${funcname}(${args}) > +{ > + ${declare_vars} > + VIR_USED(instname); > + VIR_USED(opaque); There should not be any need for this macro. Instead change the function declaration to add G_GNUC_UNUSED for "instname" and "opaque". This annotation is harmless if the variable really is used. > + > + if (!def) > + goto error; > + > + ${body} > + > + return 0; > + > + error: > + ${cleanup_vars} > + ${typename}Clear(def); IIUC, the caller of this method is responsible for allocating 'def', and so the caller should be responsible for running ${typename}Free(def). IOW, we should not need to call Clear for this. > + return -1; > +} > +def _handleTmpVars(tmpvars): > + heads, tails = [], [] > + tmpvars = dedup(tmpvars) > + for var in tmpvars: > + if var == 'nodes': > + heads.append('xmlNodePtr *nodes = NULL;') > + tails.append('VIR_FREE(nodes);') > + elif var.endswith('Str'): > + heads.append('g_autofree char *%s = NULL;' % var) > + elif var.endswith('Node'): > + heads.append('xmlNodePtr %s = NULL;' % var) > + else: > + assert var.endswith('Nodes') and var.startswith('n') > + heads.append('int %s = 0;' % var) > + > + return '\n'.join(heads), '\n'.join(tails) One thing I noticed in the generated code is that we;re duplicating the strings twice: Consider this snippet of generated code: g_autofree char *nameStr = NULL; nameStr = virXMLPropString(node, "name"); if (nameStr == NULL) { virReportError(VIR_ERR_XML_ERROR, _("Missing 'name' setting in '%s'"), instname); goto error; } def->name = g_strdup(nameStr); virXMLPropString already duplicates the string, and then we g_strdup again. So we should just get rid of the g_autofree and directly assign def->name = nameStr. > diff --git a/build-aux/generator/go b/build-aux/generator/go > new file mode 100755 > index 0000000..9e30e08 > --- /dev/null > +++ b/build-aux/generator/go > @@ -0,0 +1,14 @@ > +# This is a command-line tool > + > +libclang_line=`ldconfig -p | grep libclang` This needs to be libclang.so, otherwise it can match on the wrong library: $ ldconfig -p | grep libclang libclang.so.10 (libc6,x86-64) => /lib64/libclang.so.10 libclang.so (libc6,x86-64) => /lib64/libclang.so libclang-cpp.so.10 (libc6,x86-64) => /lib64/libclang-cpp.so.10 libclang-cpp.so (libc6,x86-64) => /lib64/libclang-cpp.so > +export libclang_path=`expr "$libclang_line" : '.* => \(.*\)$'` > +if test -z "$libclang_path"; then > + echo "libclang is required by libvirt\n" > + exit -1 > +fi > diff --git a/build-aux/generator/main.py b/build-aux/generator/main.py > new file mode 100755 > index 0000000..78a90f1 > --- /dev/null > +++ b/build-aux/generator/main.py > @@ -0,0 +1,416 @@ > +#!/usr/bin/env python3 > +# > +# Copyright (C) 2020 Shandong Massclouds Co.,Ltd. > +# > +# This library is free software; you can redistribute it and/or > +# modify it under the terms of the GNU Lesser General Public > +# License as published by the Free Software Foundation; either > +# version 2.1 of the License, or (at your option) any later version. > +# > +# This library is distributed in the hope that it will be useful, > +# but WITHOUT ANY WARRANTY; without even the implied warranty of > +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > +# Lesser General Public License for more details. > +# > +# You should have received a copy of the GNU Lesser General Public > +# License along with this library. If not, see > +# <http://www.gnu.org/licenses/>. > +# > +if __name__ == "__main__": > + > + libclang_path = os.environ.get('libclang_path') > + assert libclang_path, 'No libclang library.' > + Config.set_library_file(libclang_path) I'm wondering if we really need this at all ? AFAICT, the libclang python library works fine without it, so this is only required if trying to use a non-standard libclang, which I think ought to be possible already by setting LD_LIBRARY_PATH Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|