On Tue, Feb 02, 2021 at 11:54:43PM +0100, Johannes Schindelin wrote:

> > Would it really be so bad to do:
> >
> >   char header[4];
> >   set_packet_header(header, packet_size);
> >   if (write_in_full(fd_out, header, 4) < 0 ||
> >       write_in_full(fd_out, buf, size) < 0)
> >           return error(...);
>
> There must have been a reason why the original code went out of its way to
> copy the data. At least that's what I _assume_.

Having looked at the history, including the original mailing list
threads, it doesn't seem to be.

> I could see, for example, that these extra round-trips just for the
> header, really have a negative impact on network operations.

Keep in mind these won't be network round-trips. They're just syscall
round-trips. The OS would keep writing without an ACK while filling a
TCP window. The worst case may be an extra packet on the wire, though
the OS may end up coalescing the writes into a single packet anyway.

> > I doubt that two syscalls is breaking the bank here, but if people are
> > really concerned, using writev() would be a much better solution.
>
> No, because there is no equivalent for that on Windows. And since Windows
> is the primary target of our Simple IPC/FSMonitor work, that would break
> the bank.

Are you concerned about the performance implications, or just
portability? Falling back to two writes (and wrapping that in a
function) would be easy for the latter. For the former, there's
WSASend, but I have no idea what kind of difficulties/caveats we might
run into.

> > The other direction is that callers could be using a correctly-sized
> > buffer in the first place. I.e., something like:
> >
> >   struct packet_buffer {
> >           char full_packet[LARGE_PACKET_MAX];
> >   };
> >   static inline char *packet_data(struct packet_buffer *pb)
> >   {
> >           return pb->full_packet + 4;
> >   }
>
> Or we change it to
>
>   struct packet_buffer {
>           char count[4];
>           char payload[LARGE_PACKET_MAX - 4];
>   };
>
> and then ask the callers to allocate one of those beauties
> Not sure how well we can guarantee that the compiler won't pad this,
> though.

Yeah, I almost suggested the same, but wasn't sure about padding. I
think the standard allows there to be arbitrary padding between the
two, so it's really up to the ABI to define. I'd be surprised if this
struct is a problem in practice (we already have some structs which
assume 4-byte alignment, and nobody seems to have complained).

> And then there is `write_packetized_from_buf()` whose `src` parameter can
> come from `convert_to_git()` that _definitely_ would not be of the desired
> form.

Yep. It really does need to either use two writes or to copy, because
it's slicing up a much larger buffer (it wouldn't be the end of the
world for it to allocate a single LARGE_PACKET_MAX heap buffer for the
duration of its run, though).

> So I guess if we can get away with the 2-syscall version, that's kind of
> better than that.

I do prefer it, because then the whole thing just becomes an
implementation detail that callers don't need to care about.

-Peff
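
For illustration only, here is a rough sketch of what "wrapping that in
a function" might look like: one helper that always does two writes, and
one that uses writev() where available and otherwise falls back to the
two-write helper. None of this is code from the tree. write_all() and
format_pkt_header() are hypothetical stand-ins for write_in_full() and
set_packet_header(), SKETCH_PACKET_MAX is assumed to mirror
LARGE_PACKET_MAX, and SKETCH_NO_WRITEV is a made-up build knob for
platforms without writev().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>
#ifndef SKETCH_NO_WRITEV
#include <sys/uio.h>
#endif

/* Assumption: mirrors LARGE_PACKET_MAX; the payload limit is 4 bytes less. */
#define SKETCH_PACKET_MAX 65520
#define SKETCH_DATA_MAX (SKETCH_PACKET_MAX - 4)

/* Hypothetical stand-in for write_in_full(): loop until everything is written. */
static int write_all(int fd, const void *buf, size_t len)
{
	const char *p = buf;

	while (len > 0) {
		ssize_t n = write(fd, p, len);
		if (n < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		p += n;
		len -= (size_t)n;
	}
	return 0;
}

/* Hypothetical stand-in for set_packet_header(): 4 lowercase hex digits. */
static void format_pkt_header(char *out, unsigned int total_len)
{
	static const char hex[] = "0123456789abcdef";

	assert(total_len <= 0xffff);
	out[0] = hex[(total_len >> 12) & 0xf];
	out[1] = hex[(total_len >> 8) & 0xf];
	out[2] = hex[(total_len >> 4) & 0xf];
	out[3] = hex[total_len & 0xf];
}

/* Two-syscall variant: header, then payload straight from the caller's buffer. */
static int write_pkt_two_writes(int fd, const char *buf, size_t size)
{
	char header[4];

	if (size > SKETCH_DATA_MAX)
		return -1;
	/* The length in the header counts the 4 header bytes too. */
	format_pkt_header(header, (unsigned int)size + 4);
	if (write_all(fd, header, 4) < 0 || write_all(fd, buf, size) < 0)
		return -1;
	return 0;
}

/* Single entry point: writev() if we have it, else the two-write fallback. */
static int write_pkt(int fd, const char *buf, size_t size)
{
#ifndef SKETCH_NO_WRITEV
	char header[4];
	struct iovec iov[2];
	int i = 0;

	if (size > SKETCH_DATA_MAX)
		return -1;
	format_pkt_header(header, (unsigned int)size + 4);
	iov[0].iov_base = header;
	iov[0].iov_len = 4;
	iov[1].iov_base = (void *)buf;
	iov[1].iov_len = size;

	while (i < 2) {
		ssize_t n = writev(fd, iov + i, 2 - i);
		if (n < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		/* Skip fully written entries, then trim a partially written one. */
		while (i < 2 && (size_t)n >= iov[i].iov_len) {
			n -= (ssize_t)iov[i].iov_len;
			i++;
		}
		if (i < 2) {
			iov[i].iov_base = (char *)iov[i].iov_base + n;
			iov[i].iov_len -= (size_t)n;
		}
	}
	return 0;
#else
	return write_pkt_two_writes(fd, buf, size);
#endif
}
```

Either way the caller only ever sees write_pkt(fd, buf, size), so which
path gets used stays an implementation detail. A WSASend-based branch
could presumably slot into the same function, but I have not tried that.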