Hi Bruno and Cameron,

On Mon, Mar 19, 2012 at 07:04, Cameron Simpson <cs@xxxxxxxxxx> wrote:
> On 18Mar2012 20:19, Bruno Wolff III <bruno@xxxxxxxx> wrote:
> | On Sun, Mar 18, 2012 at 23:46:17 +0100,
> |   suvayu ali <fatkasuvayu+linux@xxxxxxxxx> wrote:
> | >
> | >I'm trying to write a regular expression that matches function and class
> | >definitions in C/C++ and defuns in lisp code. I intend to use it with
> | >sed and `git blame'. My first attempt relies on indentation. That
> | >obviously breaks rather often.
> |
> | Mathematically, regular expressions can't match braces like this to an
> | unbounded depth. You might be able to use extensions to common regular
> | expression implementations that aren't strictly regular expressions to do this.
>
> In particular, regular expressions are not recursive. Hence not capable
> of arbitrarily matching nested constructs. But you _can_ construct one
> to match a certain depth. For example, four or five deep probably covers
> most things in reasonable code (lisp excluded; that is naturally very
> bracket intensive).
>
> If you're using sed you probably want the "extended regular expressions
> mode", turned on by -E in GNU sed IIRC.
>
> Personally, for ease of debugging, I would construct the regexp from
> smaller pieces in shell. Untested example:
>
>   no_br='[^()]*'        # no brackets
>   upto1="${no_br}|(\(${no_br}\))" # no brackets or "no brackets" in brackets
>   upto2="${no_br}|(\(${upto1}\))"
>   upto3="${no_br}|(\(${upto2}\))"
>   upto4="${no_br}|(\(${upto3}\))"
>   sed -E -n "/^${upto4}\$/!p" <blame-data >blame-bad-bracketing
>
> | It might be easier to write your own parser. There are tools, like flex
> | and bison to help with this.
>
> Indeed. Or you could write a hand rolled recursive descent parser in
> whatever language you like provided it makes character by character
> access fairly easy (C, python, etc; not awk or sed).
>

I was expecting this to be the case; after all, using regular expressions
is not the same as using a programming language. I'll think about how I can
adapt my use case as per your suggestions.

Thanks a lot for the different pointers. :)

Cheers,

--
Suvayu

Open source is the future. It sets us free.
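
For reference, here is a minimal sketch of the two approaches suggested in the thread, written in Python rather than sed. It is an illustration under stated assumptions, not a tested solution for `git blame' output. The function names (nested_pattern, balanced_up_to, max_depth) are hypothetical. The depth-limited pattern is built from smaller pieces as in the quoted shell example, but its shape is my own variation: it uses repetition so that several bracketed groups at the same level (such as "a(b)c(d)") are accepted. The counting function stands in for the hand-written parser idea and has no depth limit.

```python
import re
from typing import Optional


def nested_pattern(depth: int) -> str:
    """Build a regex source for balanced parentheses nested at most `depth` deep."""
    pattern = r"[^()]*"  # depth 0: no brackets at all
    for _ in range(depth):
        # One more level: non-bracket text interleaved with any number of
        # bracketed groups whose contents match the previous level.
        pattern = r"[^()]*(?:\(" + pattern + r"\)[^()]*)*"
    return pattern


def balanced_up_to(text: str, depth: int) -> bool:
    """True if `text` is balanced and never nests deeper than `depth`."""
    return re.fullmatch(nested_pattern(depth), text) is not None


def max_depth(text: str) -> Optional[int]:
    """Counter-based check: deepest nesting level, or None if unbalanced."""
    depth = 0
    deepest = 0
    for ch in text:
        if ch == "(":
            depth += 1
            deepest = max(deepest, depth)
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return None  # a closing bracket with no matching opener
    return deepest if depth == 0 else None


if __name__ == "__main__":
    for line in ["f(a, g(b))", "f(a", "(defun f (x) (+ x 1))"]:
        print(repr(line), balanced_up_to(line, 4), max_depth(line))
```

The regex grows with every extra level and still has a hard limit, which is the restriction Bruno and Cameron point out. The counter handles any depth in a single pass, but like the regex it knows nothing about brackets inside strings or comments, so a real C/C++ or lisp scanner would need to skip those first.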