On 10.01.20 at 19:15, Ryan Zoeller wrote:
> On Friday, January 10, 2020 11:43 AM, Johannes Sixt <j6t@xxxxxxxx> wrote:
>> On 10.01.20 at 04:10, Ryan Zoeller via GitGitGadget wrote:
>>> - /* Real and complex literals */
>>> - "|[-+0-9.e_(im)]+"
>>
>> I am curious: is '(1+2i)' a single literal -- including the parentheses?
>> The expression would also mistake the character sequence '-1)+(2+' as a
>> single word; is it intended?
>
> This part of the regular expression has a pretty major mistake due
> to me misunderstanding how the parentheses were being interpreted.
> It should be something along the lines of `([-+0-9.e_]|im)+`.
>
> Julia uses `im` as the designation for an imaginary value; this regex
> was intended to admit e.g. 1+2im, in addition to other numeric values
> such as 1_000_000 and 1e10.

I see. I suggest treating 1+2im as three words '1', '+', and '2im', and
modeling numbers in this way:

    |[0-9][0-9_.]*(e[-+]?[0-9_]*)?(im)?

In particular, require a digit at the beginning, and do not allow '-' and
'+' an arbitrary number of times, because that would catch 1+2+3+4 as a
single word.

-- Hannes
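
To illustrate the difference between the two patterns, here is a rough sketch in Python. It is only an approximation: the real word regex is a POSIX extended regular expression with further alternatives, while this sketch uses Python's `re` module and a hypothetical helper `words()` that falls back to "any single non-space character" when no number matches. The groups are written as non-capturing purely for convenience.

```python
import re

# Character-class pattern from the original patch (quoted above).
OLD_NUMBER = r"[-+0-9.e_(im)]+"
# Number pattern suggested in the reply.
NEW_NUMBER = r"[0-9][0-9_.]*(?:e[-+]?[0-9_]*)?(?:im)?"


def words(number_pattern, text):
    """Split text into words: a number match, else one non-space character.

    The single-character fallback is an assumption that stands in for the
    remaining alternatives of the real word regex.
    """
    pattern = re.compile(f"(?:{number_pattern})|\\S")
    return [m.group(0) for m in pattern.finditer(text)]


if __name__ == "__main__":
    # The character class swallows operators and parentheses.
    print(words(OLD_NUMBER, "-1)+(2+"))   # ['-1)+(2+']
    print(words(OLD_NUMBER, "1+2+3+4"))   # ['1+2+3+4']

    # The suggested pattern splits at the operators.
    print(words(NEW_NUMBER, "1+2im"))     # ['1', '+', '2im']
    print(words(NEW_NUMBER, "1+2+3+4"))   # ['1', '+', '2', '+', '3', '+', '4']
    print(words(NEW_NUMBER, "1_000_000 1e10"))  # ['1_000_000', '1e10']
```

With the suggested pattern, a change from 1+2im to 1+3im would show up as a change to the single word '2im' rather than to the whole expression.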