Hi Steven: Thanks for your reply. I also add Mauro to the discussion. On 2023/8/5 0:15, Steven Rostedt wrote:
I find this kernel patch(https://lore.kernel.org/all/477b362dba1ce7fab9889a1a8e885a62c472f041.1516069914.git.tom.zanussi@xxxxxxxxxxxxxxx/T/#u) change TIME_STAMP to 8 bytes. Maybe this commit is the first time RINGBUF_TYPE_TIME_STAMP is really implemented in the kernel.On Fri, 4 Aug 2023 17:52:48 +0800 Lv Ying <lvying6@xxxxxxxxxx> wrote:Hi, all: I am a rasdaemon developer which depeneds on libtraceevent to parse kernel trace events. There is coupling between libtraceevnt and kernel trace, if something in libtraceevent and kernel does not match, which will cause libtraceevent parse wrong thing. e.g https://github.com/mchehab/rasdaemon/pull/98 We also encounter similar problem: * libtrace(old) KBUFFER_TYPE_TIME_STAMP size = 12 * kernel(new) KBUFFER_TYPE_TIME_STAMP size = 8 Such mismatch will cause strange behavior when parsing trace events.So what happened was the old 12 byte version of TIME_STAMP was never actually implemented in the kernel. When we finally got around to implementing it, we only needed 8 bytes for it, so it became 8 bytes. I made the mistake of adding that code in kbuffer.c before it was ever implemented in the kernel and said it would be 12 bytes.
When rasdaemon use old libtrace(12 byte version of TIME_STAMP) on newer kernel(8 bytes version of RINGBUF_TYPE_TIME_STAMP). Libtrace takes too long to parse timestamp events and appends the first 4 bytes of the next trace event as the last 4 bytes of the timestamp event. Such wrong parsing will lead to completely wrong parsing of the adjacent trace event field. For example, in rasdaemon we have a situation like this, libtrace parse devlink_health_report trace event next timestamp trace event as block_rq_complete trace event, which will cause coredump in block_rq_complete rasdaemon handler.
I am worried similar libtraceevent-kernel compatibility problem. I think it's a good way to sort out the interfaces that libtraceevent depends on the kernel and add test cases to determine whether libtraceevent can run on the current kernel, and to know which interfaces do not match.So if libtraceevent is released out of(independent) kernel, how does libtraceevnt to keep compatible with the running kernel(maybe not the newest)?Now that it has been implemented, it's not going to change. 8 bytes is now an API. Any more updates should not cause a problem with libtraceevent as there's many more tools that depend on it working. And the fact that it is no longer in the kernel, guarantees more that the interface will remain stable.
rasdaemon should be using the external libtraceevent library because it will be able to get more information out of any new data. Newer kernels should not break existing libtracevent, but it may just skip over new features. -- Steve .
rasdaemon now use the external libtraceevent, but rasdaemon will be shipped in many version OS by OSV, so how to ensure that rasdaemon runs correctly on various kernel versions using external libtraceevent needs to be considered.
-- Thanks! Lv Ying