On Wed, Mar 29, 2023 at 5:19 AM Felipe Contreras
<felipe.contreras@xxxxxxxxx> wrote:
> On Wed, Mar 29, 2023 at 3:58 AM Ævar Arnfjörð Bjarmason
> <avarab@xxxxxxxxx> wrote:

> > But we do need to carry some hacks going forward, some of it seems
> > pretty isolated & easy to spot, but e.g. the 6/6 fix of:
> >
> > -       if test "$c" = " "
> > +       if test "$c" = " " || test -z "$c"
> >
> > Is quite subtle, you might look at that and be convinced that the RHS is
> > redundant, and be right, but only because you assume POSIX semantics.

Actually, that isn't even true (see below).

> > If we are going to include this I think the relevant t/README and
> > Documentation/CodingGuidelines parts should be updated to note that
> > we're not targeting POSIX shellscripts anymore, but the subset of it
> > that zsh is happy with.

But in this particular case the exact opposite is true: the script is
*not* POSIX; it just happens to work on bash and other shells.

You *assume* it's POSIX because it works on bash and doesn't work on
zsh, but in this particular case bash is the non-POSIX one, and zsh is
following POSIX correctly.

> There's no point in that. I consider it a bug in zsh, along with 5/6,
> so presumably at some point it's going to be fixed.

Actually, no. I've changed my mind.

I was going to report to the zsh dev mailing list the fact that this
creates an extra empty field at the end (in sh mode):

  IFS=, ; str='foo,bar,,roo,'; printf '"%s"\n' $str

But then I read the POSIX specification, and section 2.6.5 Field
Splitting [1] is very clear on what should happen.

What muddies the waters is the distinction between `IFS white space`
characters (newline, space and tab) and non-`IFS white space`
characters (all the others).

If we ignore all the white space stuff and concentrate on the rules for
non-`IFS white space` characters (as comma is), then we arrive at this
subitem:

  3.b. "Each occurrence in the input of an IFS character that is not
  IFS white space, along with any adjacent IFS white space, shall
  delimit a field, as described previously."

In other words: each occurrence of a non-`IFS white space` character
shall delimit a field.

Or: each occurrence of a comma should delimit a field.

The script only works if the last delimiter does *not* delimit a field,
and thus it's not following POSIX; it just happens to work on most
shells.

My patch does make it align with POSIX.

I've reported bash's non-compliance with POSIX to their mailing list
[2].

But I bet nobody here will care, because POSIX is just an excuse to
segregate the shells the main developers want to make work from the
ones they do not (Brian even used the language of certain shells being
one of "the good ones").

Cheers.

[1] https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_06_05
[2] https://lists.gnu.org/archive/html/bug-bash/2023-03/msg00152.html

-- 
Felipe Contreras