On Mon, May 10, 2021 at 06:17:27AM +0000, Reshetova, Elena wrote:
> > On Fri, May 07, 2021 at 02:22:37PM +0000, Reshetova, Elena wrote:
> > > Hi,
> > >
> > > I have been working for a while now on a new smatch pattern, but
> > > would really appreciate additional information points such as past
> > > email discussions, etc.
> > >
> > > So I am wondering if there is a way to browse through
> > > the archives of this mailing list in order to try to find the
> > > information I need?
> >
> > Sorry, I don't think it's archived anywhere.  There isn't a lot of
> > traffic on the list.  About three times a year someone reports that
> > Smatch is crashing for them.
> >
> > I'm always happy to answer questions if there is any way I can help?
>
> Thank you Dan! I am pretty new with smatch so that's why I was
> hoping to browse through the existing mails to see if my simple questions
> are already answered, but here is my current issue.
>
> What is the best way to create identifiers for the findings that certain smatch
> pattern finds in the kernel? Let's say I have a new pattern that is able to find
> different problematic places and report them in usual smatch way: errors and
> warnings with file name, line number, function name, etc.
> Now for our pattern in order to be sure that the reported issue exists/does not
> exists, somebody needs to go and look at the code manually and make a call.
> After this, it would be nice to mark this place as safe/concern in the report and be
> able to transfer these results for kernel versions bumps (5.11->5.12, etc.) as soon as
> the code in this function where finding was reported has not changed (and there
> might be multiple findings per function).
>
> What is the best way of doing it?
> I was first thinking of using some simple hash for the reported line (lines around, relative
> position within the reported function),
> but now I think I need also to hash the whole function in addition to the finding itself.
>
> Then the logic of transferring the result would be:
>
> For each finding calculate:
> 1. finding_line_hash: the hash of the line that resulted in finding (becomes a unique id
> within the function).
> 2. finding_function_hash: the hash of the function that produced the finding (becomes a
> unique global id within the kernel) and helps to determine if the function has not been
> changed between the kernel versions.
>
> Logic for the result transfer:
>
> If both finding_line_hash and finding_function_hash match between the two smatch reports
> for two different versions, then it is relatively safe to transfer this concrete smatch finding
> and its manual audit result automatically.
>
> Does it make sense overall? If yes, what is the easiest way in smatch to get hash data for
> 1 and 2? I.e. get full reported line as a string and full function content as a string?

I use a script, smatch_scripts/new_bugs.pl.  It strips out the variable
names from the single quotes, any numbers, and the parentheses, so it
looks like this:

Original warning:
fs/fuse/virtio_fs.c:1468 virtio_fs_get_tree() error: double free of 'fm'

Stripped:
fs.fuse.virtio_fs.c.virtio_fs_get_tree_error:_double_free_of_''

You could hash the stripped string.  Looking at it now, the variable
name is actually useful and shouldn't be stripped out.  Doh...

I don't know what the zero day bot does to mark warnings as dealt with
or not.  There is also the Aiaiai project
(https://www.openhub.net/p/aiaiai) which probably has a feature for
marking warnings as reviewed.

regards,
dan carpenter
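
The two-hash idea from this thread could be sketched roughly as follows. This is a Python illustration, not what smatch_scripts/new_bugs.pl does: it assumes warnings have the shape "<file>:<line> <function>() <message>", drops only the line number (keeping the quoted variable name, as suggested above), and hashes the normalized warning text instead of the source line. The names finding_hash, function_hash and transfer_audits are hypothetical, and the caller is assumed to supply each function's source text from elsewhere.

```python
import hashlib
import re

# Assumed warning shape: "<file>:<line> <function>() <message>".
WARNING_RE = re.compile(r"^(?P<file>\S+):\d+ (?P<func>\w+)\(\) (?P<msg>.+)$")


def _digest(text: str) -> str:
    """Short, stable hex digest of a string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:16]


def finding_hash(warning: str) -> str:
    """Hash a warning with its line number removed, keeping variable names."""
    m = WARNING_RE.match(warning.strip())
    if m is None:
        raise ValueError(f"unrecognized warning format: {warning!r}")
    return _digest(f"{m['file']} {m['func']}() {m['msg']}")


def function_hash(source: str) -> str:
    """Hash a function's source text, ignoring whitespace-only differences."""
    return _digest(" ".join(source.split()))


def transfer_audits(old_audits, new_findings):
    """Carry audit verdicts over where both hashes match.

    old_audits:   {(finding_hash, function_hash): verdict}
    new_findings: iterable of (warning, function_source) pairs
    Returns {warning: verdict or None}; None means "needs manual review".
    """
    result = {}
    for warning, source in new_findings:
        key = (finding_hash(warning), function_hash(source))
        result[warning] = old_audits.get(key)
    return result
```

With this scheme, a warning that only moved to a different line inside an unchanged function keeps its verdict, while any change to the function body or to the message (including the variable name) sends it back for manual review. Two findings in the same function with identical messages would collide, so a real tool would probably need an extra discriminator.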