Re: bash/regex question..

cs@xxxxxxxxxx · Wed, 14 Sep 2016 10:21:54 +1000

On 13Sep2016 14:27, bruce <badouglas@xxxxxxxxx> wrote:
I'm doing 100s/1000s of these..

Still a smallish number.

but.. the other parts of the operation are
longer/more compute expensive.. this is essentially noise in the scheme of
things.. and i'm fairly certain the resource usage is a wash as well for
the diff approaches

If you have speed issues, one thing to keep in mind is that with the shell, 
fork/exec will often outweigh any compute cost in a regexp.
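
As a rough illustration of that point, here is a minimal sketch that counts the strings containing a digit in two ways: once by piping each string to grep, and once with the shell's own `case` pattern matching. The `items` list and both function names are made up for the example, and `case` uses glob patterns rather than regexps, so it only stands in for simple tests like this one.

```shell
#!/bin/sh
# Hypothetical input: a handful of strings to classify.
items='foo123 bar baz42 qux'

# Starts a grep (plus a subshell for the pipeline) for every item.
count_with_grep() {
    n=0
    for s in $items
    do
        if printf '%s\n' "$s" | grep -q '[0-9]'
        then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# Same test with built-in pattern matching: no new processes per item.
count_with_case() {
    n=0
    for s in $items
    do
        case $s in
            *[0-9]*) n=$((n + 1)) ;;
        esac
    done
    echo "$n"
}

grep_count=$(count_with_grep)
case_count=$(count_with_case)
echo "grep: $grep_count  case: $case_count"
```

Both functions give the same count; the difference is only in how many processes get started along the way, which is what tends to dominate once the loop runs thousands of times.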

The core deal with interpreted languages (shell, perl, python etc) is that you 
should try to minimise the cost of the interpretation. So the classic shell 
example looks like this:

 # expensive, fires off separate commands per file
 for f in *.txt
 do
     cat "$f" | awk | sed | ...
 done

vs:

 # cheaper, the shared sed runs once for the whole loop
 for f in *.txt
 do
     file-specific-awk ...
 done | generic-sed | ...

i.e. a theoretical example where you can run a whole bunch of things through a 
single sed or whatever outside the loop. Obviously very dependent on your exact 
problem, but I hope the approach is evident.
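
To make the two shapes concrete, here is a small runnable sketch. The scratch directory, the file contents, and the particular awk and sed commands are all invented for the demonstration; the only point is that the per-file step stays inside the loop while the generic step moves outside it.

```shell
#!/bin/sh
# Hypothetical demo data: three small .txt files in a scratch directory.
workdir=$(mktemp -d)
printf 'alpha 1\nbeta 2\n' > "$workdir/a.txt"
printf 'gamma 3\n' > "$workdir/b.txt"
printf 'delta 4\nepsilon 5\n' > "$workdir/c.txt"

# Expensive shape: cat, awk and sed are each started once per file.
per_file() {
    for f in "$workdir"/*.txt
    do
        cat "$f" | awk '{ print $1 }' | sed 's/a$/A/'
    done
}

# Cheaper shape: one awk per file, and a single sed shared by the whole loop.
hoisted() {
    for f in "$workdir"/*.txt
    do
        awk '{ print $1 }' "$f"
    done | sed 's/a$/A/'
}

slow_out=$(per_file)
fast_out=$(hoisted)
printf '%s\n' "$fast_out"

# Remove the scratch directory again.
rm -rf "$workdir"
```

With three files the first form starts nine commands and the second starts four; for N files that is 3N against N+1, while the output stays the same.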

There are flipsides to this of course; if the code becomes unreadable or 
unmaintainable then the programmer cost can exceed the actual real world 
runtime benefit, particularly for once-off tasks.

Re the unreadable bit, that is one of my rules of thumb for switching 
languages: when the task is no longer clearly and succinctly expressed.

Cheers,
Cameron Simpson <cs@xxxxxxxxxx>