Re: [PATCH v3 03/22] build-aux: rewrite po file minimizer in Python

Daniel P. Berrangé <berrange@xxxxxxxxxx> · Thu, 26 Sep 2019 16:38:49 +0100

On Thu, Sep 26, 2019 at 05:34:49PM +0200, Ján Tomko wrote:
> On Thu, Sep 26, 2019 at 02:16:04PM +0100, Daniel P. Berrangé wrote:
> > On Thu, Sep 26, 2019 at 12:39:39PM +0200, Erik Skultety wrote:
> > > On Tue, Sep 24, 2019 at 03:58:44PM +0100, Daniel P. Berrangé wrote:
> > > question 1) what's the benefit of compiling a regex and using it only once? Btw
> > > python does cache every pattern passed to re.match (and friends) so compilation
> > > IMO hardly ever makes sense unless you're doing 1000s of searches for the same
> 
> Some of the scripts here are run on the whole libvirt codebase so that
> is the case here. For example just removing the pre-compilation of
> regexes for comments from the spacing check script bumped the execution
> time from 6.5s to 7.4s
> 
> Sadly, the one script where pre-compilation matters the most is the one
> where separating the regexes puts them so far from their usage that
> they no longer fit on one screen.

I could write a small custom function that caches all compiled regexes:

  import re

  recache = {}  # pattern string -> compiled regex

  def research(regex, line):
    # Compile each distinct pattern only once, then reuse it
    if regex not in recache:
      recache[regex] = re.compile(regex)
    return recache[regex].search(line)

then in the loop we can do a normal

     research(r'''some regex''', line)

so we get readability and full caching together. It is probably not
worth repeating this trick in every script, but the whitespace script
and a few others would likely benefit.
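
To make the idea concrete, here is a rough, self-contained sketch of how
the helper might be used inside a per-line check. The
count_trailing_whitespace() function and its pattern are hypothetical,
made up purely for illustration and not taken from any of the real
scripts; only research() and recache follow the snippet above.

```python
import re

# Module-level cache: pattern string -> compiled regex object.
recache = {}


def research(regex, line):
    """Search line for regex, compiling each distinct pattern only once."""
    prog = recache.get(regex)
    if prog is None:
        prog = recache[regex] = re.compile(regex)
    return prog.search(line)


def count_trailing_whitespace(lines):
    # Hypothetical check, for illustration only. The pattern stays inline,
    # right next to its use, yet is compiled just once for the whole run.
    return sum(1 for line in lines if research(r'''[ \t]+$''', line))
```

The pattern string itself is the cache key, so the regex can stay next
to the code that uses it and the only per-call cost is a dict lookup.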

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
