On 7/26/24 23:05, Sami Tolvanen wrote:
> On Mon, Jul 22, 2024 at 8:20 AM Petr Pavlu <petr.pavlu@xxxxxxxx> wrote:
>>
>> From my perspective, I'm okay if gendwarfksyms doesn't provide
>> functionality to compare a new object file with its reference symtypes
>> file.
>>
>> As mentioned, genksyms has this functionality but I actually think the
>> way it works is not ideal. Its design is to operate on one compilation
>> unit at the time. This has the advantage that a comparison of each file
>> is performed in parallel during the build, simply because of the make
>> job system. On the other hand, it has two problems.
>>
>> The first one is that genksyms doesn't provide a comparison of the
>> kernel as a whole. This means that the tool gives rather scattered and
>> duplicated output about changed structs in the build log. Ideally, one
>> would like to see a single compact report about what changed at the end
>> of the build.
>
> Sure, that makes sense. Android uses STG for this, which might be
> useful to other folks too:
>
> https://android.googlesource.com/platform/external/stg/
> https://android.googlesource.com/platform/external/stg/+/refs/heads/main/doc/stgdiff.md#output-formats

STG is an interesting tool. I've played with it a bit last year. To be
frank, I was surprised to see a new tool being proposed by Google to
generate modversion CRCs from DWARF instead of potentially extending
your STG project for this purpose. I'm not sure if it is something that
you folks have considered and evaluated.

>> A few months ago, I also started working on a tool inspired by this
>> script. The goal is to have similar functionality but hopefully with
>> a much faster implementation. Hence, this tool is written in a compiled
>> language (Rust at the moment) and should also become multi-threaded. I'm
>> hoping to find some time to make progress on it and make the code
>> public.
>> It could later be added to the upstream kernel to replace the
>> comparison functionality implemented by genksyms, if there is interest.
>>
>> So as mentioned, I'm fine if gendwarfksyms doesn't have this
>> functionality. However, for distributions that rely on the symtypes
>> format, I'd be interested in having gendwarfksyms output its dump data
>> in this format as well.
>
> We can definitely tweak the output format, but I'm not sure if making
> it fully compatible with the genksyms symtypes format is feasible,
> especially for Rust code. I also intentionally decided to use DWARF
> tag names in the output instead of shorthands like s# etc. to make it
> a bit more readable.

Sure, it might be necessary to extend the symtypes format a bit, for
example, by allowing spaces in type names. What other problems do you
see?

The example I showed preserves the DWARF tag names in type descriptions.
Cross-references and the target type names use the s# prefix as they
need to be distinguished from other tokens.

>> For example, instead of producing:
>>
>> gendwarfksyms: process_exported_symbols: _some_mangled_func_name (@ XYZ)
>> subprogram(
>> [formal parameters...]
>> )
>> -> structure_type core::result::Result<(), core::fmt::Error> {
>> [a description of the structure...]
>> };
>>
>> .. the output could be something like this:
>>
>> S#'core::result::Result<(), core::fmt::Error>' structure_type core::result::Result<(), core::fmt::Error> { [a description of the structure...] }
>> _some_mangled_func_name subprogram _some_mangled_func_name ( [formal parameters...] ) -> S#'core::result::Result<(), core::fmt::Error>'
>
> This wouldn't be enough to make the output format compatible with
> symtypes though. genksyms basically produces a simple key-value pair
> database while gendwarfksyms currently outputs the fully expanded type
> string for each symbol.
> If you need the tool to produce a type
> database, it might also be worth discussing if we should use a bit
> less ad hoc format in that case.

What I think is needed is the ability to compare an updated kernel with
some previous reference and have an output that clearly and accurately
shows why CRCs of some symbols changed. The previous reference should be
possible to store in Git together with the kernel source. It means it
should be ideally some text format and limited in size. This is what
distributions that care about stable kABI do in some form currently.

This functionality would be needed if some distribution wants to
maintain stable Rust kABI (not sure if it is actually feasible), or if
the idea is for gendwarfksyms to be a general tool that could replace
genksyms. I assume for the sake of argument that this is the case.

Gendwarfksyms could implement this functionality on its own, or as
discussed, I believe it could provide a symtypes-like dump and a second
tool could be used to work with this format and for comparing it.
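
To illustrate what such a second tool might do, here is a minimal Python
sketch. It is not how genksyms or gendwarfksyms work today. It assumes a
hypothetical symtypes-like dump with one record per line, where the
first token is the key and the rest is the description, and where a key
may be quoted to allow spaces (for example s#'a b'). The function names
parse_symtypes and compare_symtypes are made up for this example.

```python
import re

# Assumed key syntax: a bare token such as s#foo or my_func, or an
# optional one-letter prefix followed by a single-quoted name that may
# contain spaces, such as s#'core::result::Result<(), core::fmt::Error>'.
_KEY_RE = re.compile(r"(?:[A-Za-z]#)?'[^']*'|\S+")


def parse_symtypes(text):
    """Parse a symtypes-like dump into a {key: description} dictionary."""
    records = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        match = _KEY_RE.match(line)
        records[match.group(0)] = line[match.end():].strip()
    return records


def compare_symtypes(reference, updated):
    """Return sorted lists of (added, removed, changed) keys."""
    added = sorted(updated.keys() - reference.keys())
    removed = sorted(reference.keys() - updated.keys())
    changed = sorted(
        key
        for key in reference.keys() & updated.keys()
        if reference[key] != updated[key]
    )
    return added, removed, changed


reference = parse_symtypes(
    "s#foo structure_type foo { int a ; }\n"
    "my_func subprogram my_func ( s#foo * ) -> int\n"
)
updated = parse_symtypes(
    "s#foo structure_type foo { int a ; long b ; }\n"
    "my_func subprogram my_func ( s#foo * ) -> int\n"
)
print(compare_symtypes(reference, updated))
```

Because my_func refers to the structure only through its s#foo key, the
change is reported once, on the s#foo record, rather than being repeated
for every symbol that uses the type. A real tool would additionally need
to follow the cross-references to list which exported symbols are
affected by a changed type.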