On 12/22/22 1:53 AM, Thorsten Glaser wrote:
> On Sat, 26 Nov 2022, Alexey Dobriyan wrote:
>
>> /proc never escaped "comm" field of /proc/*/stat.
>
> Yes, that’s precisely the bug.
>
>> To parse /proc/*/stat reliably, search for '(' from the beginning, then
>> for ')' backwards. Everything in between parenthesis is "comm".
>
> That’s not guaranteed to stay reliable: fields can be, and have
> been in the past, added, and new %s fields will break this. Do
> not rely on it either.
>
>> Everything else are numbers separated by spaces.
>
> Currently, yes.
>
> But the field is *clearly* documented as intended to be
> parsable by scanf(3), which splits on white space. So the
> Linux kernel MUST encode embedded whitespace so the
> documented(!) access method works.

No. Escaping would break existing programs which parse the line by
searching for the ')' from the right. The format, surely, is ugly, but
that is how it is. If some documentation suggests that you can just
parse it with scanf, the documentation should be corrected/improved
instead.

Are you referring to proc(5) "The fields, in order, with their proper
scanf(3) format specifiers, are listed below" [1] or something else?

The referenced manual page is wrong in regard to the length, too. There
is no 16 character limit to the field, because it can contain a
workqueue task name, too:

    buczek@theinternet:/tmp$ cat /proc/27190/stat
    27190 (kworker/11:2-mm_percpu_wq) I 2 0 0 0 -1 69238880 0 0 0 0 0 170 0 0 20 0 1 0 109348986 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 0 0 0 17 11 0 0 0 0 0 0 0 0 0 0 0 0 0

The current limit seems to be 64 characters [2] when escaping is off, as
is the case with /proc/pid/stat. But generally the length of the field,
and thereby of the whole line, seems to be rather undefined. So to parse
that, you either need to do some try-and-restart-with-a-bigger-buffer
dance or use a buffer size which you just hope will be big enough for
the foreseeable future.
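
For illustration, the "first '(' from the left, last ')' from the right"
approach can be sketched in a few lines of Python. This is only a rough
sketch, not code from the kernel or from any existing tool: the helper
names parse_stat_line and read_stat are made up here, and it assumes
that "comm" is the only free-form field on the line, as is the case
today. Reading the whole file in one go leaves the buffer growing to
Python, which sidesteps the buffer-size question rather than answering
it.

```python
def parse_stat_line(line):
    """Split one /proc/<pid>/stat line into (pid, comm, other_fields).

    Assumption: "comm" is the only field that may contain spaces or
    parentheses, so it spans from the first '(' to the last ')'.
    """
    start = line.index("(")    # first '(' searching from the left
    end = line.rindex(")")     # last ')' searching from the right
    if end < start:
        raise ValueError("malformed stat line")
    pid = int(line[:start].strip())
    comm = line[start + 1:end]          # may contain ' ', '(' and ')'
    other_fields = line[end + 1:].split()
    return pid, comm, other_fields


def read_stat(pid="self"):
    """Read and parse /proc/<pid>/stat (hypothetical helper, Linux only)."""
    with open(f"/proc/{pid}/stat", "rb") as f:
        data = f.read()        # no fixed-size buffer to guess
    # comm may hold arbitrary bytes, so do not fail on invalid UTF-8
    return parse_stat_line(data.decode("utf-8", errors="replace"))
```

With the kworker line shown above, parse_stat_line returns 27190 as the
pid and "kworker/11:2-mm_percpu_wq" as comm. A comm such as "a) R (b c"
also comes back intact, which a plain split on white space would not
manage.
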

In fact, if you start escaping now, you might also break programs which
rely on the current 64 character limit.

Best
  Donald

[1]: https://man7.org/linux/man-pages/man5/proc.5.html
[2]: https://elixir.bootlin.com/linux/latest/source/fs/proc/array.c#L99

>
> bye,
> //mirabilos
>

--
Donald Buczek
buczek@xxxxxxxxxxxxx
Tel: +49 30 8413 1433