Re: [PATCH 2/2] builtin/cat-file.c: support NUL-delimited input with `-z`

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Junio,

On Sat, Jul 23, 2022 at 10:45:05AM -0700, Junio C Hamano wrote:
> > So isn't this just an accident in of us having used the strbuf_getline()
> > function to mean "\n", but actually it also does "\r\n".
> >
> > Which is a really unfortunately named function b.t.w., since it sneaks
> > this bit of Windows portability into places that may not want it in the
> > first place.
>
> getline() is to read a single "logical" line, so it is fine for it
> to strip CRLF on platforms with CRLF, and to leave CR at the end of
> line on a LF platform.  If the "protocol" is defined to use LF on
> any platform (and allow a payload that ends with CR to be passed),
> you can argue that it is a wrong helper to call.

Reading your email up to this point makes me think that we should ignore
any CR bytes we see next to a LF. So by this point I think that we
should take Ævar's suggestion and call:

    strbuf_getwholeline(&buf, stdin, opt->nul_terminated ? '\0' : '\n');

But...

> But does that result in a sensible behaviour?  I am not sure.  Some
> editors that are popular on LF platforms can produce CRLF files when
> the user asks (either on purpose or by mistake), and when not using
> the "-z" mode of input, I suspect that most users would expect CR at
> the end of the "line" (terminated with LF on their platform of
> choice) would be thrown away even on their LF platform, simply
> because it is unlikely that the kind of input they are preparing can
> usefully and legitimately end with CR, as their colleagues on CRLF
> platforms may also have to produce similar input.  IOW, the presence
> of CRLF platforms makes text lines that end with CR much less useful
> everywhere.
>
> And from that point of view, "getline() or getwholeline(NUL)" may be
> a pattern that is practically more useful than the old "use always
> getwholeline(NUL or LF)" convention that I invented more than 10
> years ago.

This makes me think that we should retain support for dropping the CR
preceeeding a LF and treat it as a historical wart that we are stuck
supporting.

Do you have a preference? I am fine with either approach, FWIW.

Thanks,
Taylor



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux