Powered by Linux
Re: smatch on illumos & false positive — Semantic Matching Tool

Re: smatch on illumos & false positive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Jan 06, 2015 at 12:36:08PM +0300, Dan Carpenter wrote:
> On Mon, Jan 05, 2015 at 01:54:46PM -0500, Josef 'Jeff' Sipek wrote:
> > While I'm writing this email, I might as well point out a false positive
> > smatch told me about.  I have a program that takes an year and month as a 6
> > digit number (e.g., 201501) and I want double check that it's valid.  (The
> > first 4 digits look like a year and the last two digits look like a month
> > number.)  Later on, I use the month part to index into an array of month
> > names.  Here's a simplified test case:
> 
> I've changed your test case a little to add some smatch debugging stuff:
> 
> #include <stdio.h>
> #include <unistd.h>
> 
> #include "check_debug.h"
> 
> static int validate_arch_id(int arch)
> {
> 	int y = arch / 100;
> 	int m = arch % 100;
> 
> 	return (m >= 1) && (m <= 12) && (y >= 1970) && (y < 2100);
> }
> 
> int main(int argc, char **argv)
> {
> 	static const char *months[12] = {
> 		"January", "February", "March", "April", "May", "June",
> 		"July", "August", "September", "October", "November",
> 		"December",
> 	};
> 
> 	int arch = 201201; /* test */
> 	int y = arch / 100;
> 	int m = arch % 100;
> 
> 	__smatch_implied(arch / 100);
> 	__smatch_implied(arch % 100);
> 	__smatch_implied(validate_arch_id(arch));
> 	__smatch_implied((m >= 1) && (m <= 12) && (y >= 1970) && (y < 2100));
> 
> 	if (!validate_arch_id(arch))
> 		arch = 197001;
> 
> 	__smatch_implied(arch);
> 
> 	printf("%s\n", months[(arch % 100) - 1]);
> 
> 	return 0;
> }

Neat, I didn't know about __smatch_implied.  Looks quite useful.

...
> The next question is should I make:
> 	test.c:34 main() implied: arch = '197001,201201'
> into:
> 	test.c:34 main() implied: arch = '201201'
> I don't think I want to do that.  It's a philosophical debate because
> how should you handle impossible conditions?  In this case we know that
> the date is valid so we could ignore that whole if condition but the
> problem is that Smatch is often wrong about the stuff which it "knows"
> so I normally pretend that nothing is impossible.

Hrm?  I'm a little confused about how that's possible.  I am (naively?)
assuming that smatch incrementally builds its view of the world by assuming
nothing and using logic to determine what's possible and what's not.  I'm
not sure where this uncertainty could come from.

> Another idea that occurs to me is that '197001,201201' % 100 can be
> calculated as '197001 % 100, 201201 % 100' and we know that is '1'
> instead of 0-99 which makes the error message go away.

Is the buffer overflow check doing a "int % 100" instead of
"<possible values> % 100"?

> But none of that helps in the real life case where we don't start out
> with "arch = 201201;".

Agreed.  Here's what I get with "int arch = atoi(argv[1]);":

$ ../oss/smatch/smatch -D__amd64 test.c 
test.c:26 main() implied: arch / 100 = ''
test.c:27 main() implied: arch % 100 = '0-99'
test.c:28 main() implied: validate_arch_id(arch) = '0-1'
test.c:29 main() implied: (m >= 1) && (m <= 12) && (y >= 1970) && (y < 2100) = '0-1'
test.c:34 main() implied: arch = 's32min-s32max'
test.c:36 main() error: buffer overflow 'months' 12 <= 98

> The problem is Smatch only tracks whole
> variables so it understands "var <= 12" but it doesn't understand
> "arch % 100 <= 12".  I have some ideas about changing this, but it's
> going to take months before I can look at it.  Also to track this across
> function boundaries?  Forget about that...  :P

Heh, I knew this was a bit of a tall order when I was writing my original
email :)

Huh... now I'm confused.  I changed the code so that "validate" would
actually return the year & month components:

static int parse_arch_id(int arch, int *y, int *m)
{
	*y = arch / 100;
	*m = arch % 100;

	return (m >= 1) && (m <= 12) && (y >= 1970) && (y < 2100);
}

Then, in main I do:

	if (!parse_arch_id(arch, &y, &m)) {
		arch = 197001;
		y = 1970;
		m = 1;
	}

	__smatch_implied(arch);
	__smatch_implied(y);
	__smatch_implied(m);

	printf("%s\n", months[m - 1]);

This apparently is enough for smatch to not be confused:

test2.c:30 main() implied: arch / 100 = '2012'
test2.c:31 main() implied: arch % 100 = '0-99'
test2.c:32 main() implied: (m >= 1) && (m <= 12) && (y >= 1970) && (y < 2100) = '0-1'
test2.c:34 main() implied: parse_arch_id(arch, &y, &m) = '0-1'
test2.c:42 main() implied: arch = '197001,201201'
test2.c:43 main() implied: y = '1970,2012'
test2.c:44 main() implied: m = '0-99'

No sign of the buffer overflow error.  If I use atoi() instead of a hardcoded
constant, I get very similar output - again without an error:

test2.c:30 main() implied: arch / 100 = ''
test2.c:31 main() implied: arch % 100 = '0-99'
test2.c:32 main() implied: (m >= 1) && (m <= 12) && (y >= 1970) && (y < 2100) = '0-1'
test2.c:34 main() implied: parse_arch_id(arch, &y, &m) = '0-1'
test2.c:42 main() implied: arch = 's32min-s32max'
test2.c:43 main() implied: y = 's32min-s32max'
test2.c:44 main() implied: m = '0-99'


Ok, one more question: shouldn't y have a more restrited range than
s32min-s32max?

Specifically, shouldn't it be:

y = [-21474837, 21474836]

instead of:

y = [-2147483648L, 2147483647]


Thanks,

Jeff.

-- 
All parts should go together without forcing.  You must remember that the
parts you are reassembling were disassembled by you.  Therefore, if you
can’t get them together again, there must be a reason.  By all means, do not
use a hammer.
		— IBM Manual, 1925
--
To unsubscribe from this list: send the line "unsubscribe smatch" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Big List of Linux Books]

  Powered by Linux