On Mon, Feb 01, 2021 at 07:45:35PM +0000, Jeff Hostetler via GitGitGadget wrote: > -static int packet_write_gently(const int fd_out, const char *buf, size_t size) > +/* > + * Use the provided scratch space to build a combined <hdr><buf> buffer > + * and write it to the file descriptor (in one write if possible). > + */ > +static int packet_write_gently(const int fd_out, const char *buf, size_t size, > + struct packet_scratch_space *scratch) Thanks for addressing my stack space concern. This solution does work (and I like wrapping it in a struct like this), though I have to wonder if we're not just punting on the thread issues in an ever-so-slight way with things like this: > void packet_write(int fd_out, const char *buf, size_t size) > { > - if (packet_write_gently(fd_out, buf, size)) > + static struct packet_scratch_space scratch; > + > + if (packet_write_gently(fd_out, buf, size, &scratch)) > die_errno(_("packet write failed")); > } Where we just moved it one step up the call stack. > int write_packetized_from_fd(int fd_in, int fd_out) > { > + /* > + * TODO We could save a memcpy() if we essentially inline > + * TODO packet_write_gently() here and change the xread() > + * TODO to pass &buf[4]. > + */ And comments like this make me wonder if the current crop of pktline functions are just mis-designed in the first place. There are two obvious directions here. One, we can observe that the only reason we need the scratch space is to ship out the whole thing in a single write(): > [in packet_write_gently] > - set_packet_header(packet_write_buffer, packet_size); > - memcpy(packet_write_buffer + 4, buf, size); > - if (write_in_full(fd_out, packet_write_buffer, packet_size) < 0) > + > + set_packet_header(scratch->buffer, packet_size); > + memcpy(scratch->buffer + 4, buf, size); > + > + if (write_in_full(fd_out, scratch->buffer, packet_size) < 0) > return error(_("packet write failed")); Would it really be so bad to do: char header[4]; set_packet_header(header, packet_size); if (write_in_full(fd_out, header, 4) < 0 || write_in_full(fd_out, buf, size) < 0) return error(...); I doubt that two syscalls is breaking the bank here, but if people are really concerned, using writev() would be a much better solution. Obviously we can't rely on it being available everywhere, but it's quite easy to emulate with a wrapper (and I'd be happy punt on any writev stuff until somebody actually measures a difference). The other direction is that callers could be using a correctly-sized buffer in the first place. I.e., something like: struct packet_buffer { char full_packet[LARGE_PACKET_MAX]; }; static inline char *packet_data(struct packet_buffer *pb) { return pb->full_packet + 4; } That lets people work with the oversized buffer in a natural-ish way that would be hard to get wrong, like: memcpy(packet_data(pb), some_other_buf, len); (though if we wanted to go even further, we could provide accessors that actually do the writing and sanity-check the lengths; the downside is that I'm not sure how callers typically get the bytes into these bufs in the first place). That's a much bigger change, of course, and I'd guess you'd much prefer to focus on the actual point of your series. ;) -Peff