On Mon, 20 Jan 2025 09:30:48 -0800 Breno Leitao wrote: > > > Not sure I followed. The data ({userdata,extradata}_complete) was always > > > inside nt field, which belongs to target_list. > > > > I mean the buffer we use for formatting. Today it's this: > > > > static char buf[MAX_PRINT_CHUNK]; /* protected by target_list_lock */ > > int header_len, msgbody_len; > > const char *msgbody; > > > > right? I missed that "static" actually so it's not on the stack, > > it's in the .bss section. > > Since you raised this topic, I don't think buf needs to be static > for a functional perspective, since `buf` is completely overwritten > every time send_msg functions are called. It may be because it's relatively big and stack space used to be very limited. > > My thinking was to handle it like the release. > > Print it at the send_msg_no_fragmentation() stage directly > > into the static buffer. Does that get hairy coding-wise? > > I suppose the advantage of doing this approach is to reduce a > memcpy/strcpy, right? Not really, my main motivation is to try to find a common way of how various pieces of the output are protected and handled. > If this is what your motivation, I think we cannot remove it from the > fragmented case. Let me share my thought process: > > 1) sysdata needs to be appended to both send_msg_fragmented() and > send_msg_no_fragmentation(). The fragmented case is the problem. > > 2) It is trivially done in send_msg_fragmented() case. > > 3) For the send_msg_no_fragmentation() case, there is no trivial way to > get it done without using a secondary buffer and then memcpy to `buf`. > > Let's suppose sysdata has "cpu=42", and original `buf` has only 5 available > chars, thus it needs to have 2 msgs to accommodate the full message. > > Then the it needs to track that `cpu=4` will be sent in a msg and create > another message with the missing `2`. > > The only way to do it properly is having a extra buffer where we > have `cpu=42` and copy 5 bytes from there, and then copy the last one in > the next iteration. I am not sure we can do it in one shot. FWIW to simplify reasoning about the length I thought we could take the worst case, assume we'll need len(cpu=) + log10(nr_cpu_ids) of space. > On top of that, I am planning to increase other features in sysdata > (such as current task name, modules and even consolidate the release as > sysdata), which has two implications: > > 1) Average messages size will become bigger. Thus, memcpy will be needed > one way or another. > > 2) Unless we can come up with a smart solution, this solution will be > harder to reason about. > > If you want to invest more time in this direction, I am more than happy > to create a PoC, so we can discuss more concretely. I don't feel super strongly about this. But hacking around is always good to get a sense of how hairy the implementation ends up being. To rephrase my concern is that we have some data as static on the stack, some dynamically appended at the send_*() stage, now we're adding a third way of handling things. Perhaps the simplest way to make me happy would be to move the bufs which are currently static into nt.