Hi Oskari,

On 4/15/23 09:21, Oskari Pirhonen wrote:
> Hi Alex,
>
> On Fri, Apr 14, 2023 at 13:26:59 +0200, Alejandro Colomar wrote:
>> Hi Oskari,
>>
>> On 4/14/23 05:19, Oskari Pirhonen wrote:
>>> Remove leading whitespace and collapse multi-line declarations into a
>>> single line using (g)awk.
>>
>> I can't read awk(1) :(
>>
>
> Awww man, but I even left the optional semicolons in... :-)
>
>> But I like the idea. I implemented the same using sed(1) after your
>> suggestion. Does the below patch look good to you?
>>
>
> I actually had an earlier version with sed(1), but it used
> looping/branching to handle the multi-line bits, so I figured it was a
> bit ugly and didn't send it. I didn't think to try `-z`.
>
> It seems to do the same thing, so LGTM.

Good.

>
>> Cheers,
>> Alex
>>
>> P.S.: I forgot about writing a man page. I'll start now.
>>
>
> I was about to say "and license file and appropriate blurb" but then I
> saw your commit. I've got some suggestions for the man page, so I'll
> send some patches sometime soon.

Sure, please!

>
> - Oskari
>
>
> Since you said you can't read awk, then just to satisfy your curiosity,

Thanks!

> here's what was going on:
>
>     BEGIN {
>         RS = ";\n"
>         ORS = RS
>     }
>
> This block is run at the start before any records are processed.
>
> The default Record Separator is "\n", but here we set it to ";\n". In
> (g)awk, a value of `RS` that is >1 char is actually a regex, but we only
> need to match a literal string. The Output Record Separator is by
> default also "\n".
>
>     {
>         gsub(/\n/, " ")
>         sub(/^ +/, "")
>         gsub(/ +/, " ")
>         print
>     }
>
> This block is run on all records, since it doesn't have any patterns for
> conditional execution attached to it.
>
> `gsub()` does an in-place global regex replace on a string, similar to
> the `s/regex/replace/g` you're familiar with. It takes an optional third
> arg, but if it's left out then it has an implicit `$0` which means the
> entire current record. `sub()` is like `gsub()`, but only does the first
> match, similar to `s/regex/replace/`.
>
> `print`, without any args, prints the current record followed by `ORS`.
> This was set earlier because the `RS` is consumed from the input as it
> is being processed record by record, but we want to keep the output
> looking intact.
>
> Hopefully not so bad after all.

Meh, I'll need some time to get used to it, I guess :)

> Awk is pretty nice IMO (and gawk in
> particular), and I would recommend checking it out if you find yourself
> bored one day :)

So far, I've almost always found one way or another with more specific
commands. The only case where I find awk(1) interesting is for processing
columns; it's quite good at that (e.g., for printing column sum totals).
The other alternative I can think of is pee(1), processing each column
separately, which might be more readable (for me). But I might give it a
try some day. :)

For now, awk(1) is only for `awk '{print $3}'` to me.

Cheers,
Alex

-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5
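
For reference, the two awk blocks explained above can be put together into one runnable sketch. The wrapper function name `collapse_decls` and the sample declarations fed to it are made up for illustration and do not come from the thread. It also assumes gawk, or another awk that treats a multi-character `RS` as a regex; an awk that only honours the first character of `RS` would split the records differently.

```shell
#!/bin/sh

# Hypothetical wrapper name; the awk program is the one quoted in the thread.
collapse_decls() {
    awk '
    BEGIN {
        RS = ";\n"      # a record ends at ";" followed by a newline
        ORS = RS        # put the consumed separator back on output
    }
    {
        gsub(/\n/, " ") # join the lines of a multi-line declaration
        sub(/^ +/, "")  # drop leading spaces
        gsub(/ +/, " ") # squeeze runs of spaces into one
        print
    }'
}

# Made-up sample input: one two-line declaration and one indented one.
printf '    int foo(int a,\n            int b);\n  void bar(void);\n' |
    collapse_decls
# With gawk this prints:
#   int foo(int a, int b);
#   void bar(void);
```

Each input record is everything up to the next `;` plus newline, so a declaration that spans several lines arrives in the main block as a single `$0`, which is why three plain substitutions are enough.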