Powered by Linux
Re: smatch on illumos & false positive — Semantic Matching Tool

Re: smatch on illumos & false positive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Ok.  I've pushed the two patches I mentioned earlier.

On Tue, Jan 06, 2015 at 12:13:12PM -0500, Josef 'Jeff' Sipek wrote:
> ...
> > The next question is should I make:
> > 	test.c:34 main() implied: arch = '197001,201201'
> > into:
> > 	test.c:34 main() implied: arch = '201201'
> > I don't think I want to do that.  It's a philosophical debate because
> > how should you handle impossible conditions?  In this case we know that
> > the date is valid so we could ignore that whole if condition but the
> > problem is that Smatch is often wrong about the stuff which it "knows"
> > so I normally pretend that nothing is impossible.
> 
> Hrm?  I'm a little confused about how that's possible.  I am (naively?)
> assuming that smatch incrementally builds its view of the world by assuming
> nothing and using logic to determine what's possible and what's not.  I'm
> not sure where this uncertainty could come from.
> 

One problem is that there is a lot of bugs in impossible to reach code
so I want to warn about those.  There is a lot of impossible to reach
code, btw.

But a bigger problem for Smatch is that it's super bad about handling
loops.  It just goes through it one time.  It handles canonical for
loops as a special case.  So it thinks a lot of code is impossible when
really it's not.

Also Smatch assumes there is only one thread so some conditions which
look impossible actually are possible with multi-threading.

> > Another idea that occurs to me is that '197001,201201' % 100 can be
> > calculated as '197001 % 100, 201201 % 100' and we know that is '1'
> > instead of 0-99 which makes the error message go away.
> 
> Is the buffer overflow check doing a "int % 100" instead of
> "<possible values> % 100"?

Yeah.  But now, I have changed that so it says '1' as you suggest.

> 
> > But none of that helps in the real life case where we don't start out
> > with "arch = 201201;".
> 
> Agreed.  Here's what I get with "int arch = atoi(argv[1]);":
> 
> $ ../oss/smatch/smatch -D__amd64 test.c 
> test.c:26 main() implied: arch / 100 = ''
> test.c:27 main() implied: arch % 100 = '0-99'
> test.c:28 main() implied: validate_arch_id(arch) = '0-1'
> test.c:29 main() implied: (m >= 1) && (m <= 12) && (y >= 1970) && (y < 2100) = '0-1'
> test.c:34 main() implied: arch = 's32min-s32max'
> test.c:36 main() error: buffer overflow 'months' 12 <= 98
> 
> > The problem is Smatch only tracks whole
> > variables so it understands "var <= 12" but it doesn't understand
> > "arch % 100 <= 12".  I have some ideas about changing this, but it's
> > going to take months before I can look at it.  Also to track this across
> > function boundaries?  Forget about that...  :P
> 
> Heh, I knew this was a bit of a tall order when I was writing my original
> email :)
> 
> Huh... now I'm confused.  I changed the code so that "validate" would
> actually return the year & month components:
> 
> static int parse_arch_id(int arch, int *y, int *m)
> {
> 	*y = arch / 100;
> 	*m = arch % 100;
> 
> 	return (m >= 1) && (m <= 12) && (y >= 1970) && (y < 2100);

This test needs to be updated by the way.  The "m" should be "*m" now.
When you fix that then it will get the months correct as "1-12".  But
gar...  It still doesn't get the year right that it's '1970-2099'.  Ok.
I have written a patch for this and I will try test it and push
tomorrow.

> }
> 
> Then, in main I do:
> 
> 	if (!parse_arch_id(arch, &y, &m)) {
> 		arch = 197001;
> 		y = 1970;
> 		m = 1;
> 	}
> 
> 	__smatch_implied(arch);
> 	__smatch_implied(y);
> 	__smatch_implied(m);
> 
> 	printf("%s\n", months[m - 1]);
> 
> This apparently is enough for smatch to not be confused:
> 
> test2.c:30 main() implied: arch / 100 = '2012'
> test2.c:31 main() implied: arch % 100 = '0-99'
> test2.c:32 main() implied: (m >= 1) && (m <= 12) && (y >= 1970) && (y < 2100) = '0-1'
> test2.c:34 main() implied: parse_arch_id(arch, &y, &m) = '0-1'
> test2.c:42 main() implied: arch = '197001,201201'
> test2.c:43 main() implied: y = '1970,2012'
> test2.c:44 main() implied: m = '0-99'
> 
> No sign of the buffer overflow error.  If I use atoi() instead of a hardcoded
> constant, I get very similar output - again without an error:

The MOD is getting lost in the cross function transition...  I should
probably figure out why and make it print an error consistently...

> 
> test2.c:30 main() implied: arch / 100 = ''
> test2.c:31 main() implied: arch % 100 = '0-99'
> test2.c:32 main() implied: (m >= 1) && (m <= 12) && (y >= 1970) && (y < 2100) = '0-1'
> test2.c:34 main() implied: parse_arch_id(arch, &y, &m) = '0-1'
> test2.c:42 main() implied: arch = 's32min-s32max'
> test2.c:43 main() implied: y = 's32min-s32max'
> test2.c:44 main() implied: m = '0-99'
> 
> 
> Ok, one more question: shouldn't y have a more restrited range than
> s32min-s32max?
> 
> Specifically, shouldn't it be:
> 
> y = [-21474837, 21474836]
> 
> instead of:
> 
> y = [-2147483648L, 2147483647]
> 

It's not ideal, but it's deliberate.  's32min-s32max' means unknown.
An unknown value divided by 100 is an unknown value.  If we said it was
'(-21474837)-21474836' then we'd have to start warning about array
over/underflows.

But you're right, it's not great.  I'll have to think about this.

regards,
dan carpenter
--
To unsubscribe from this list: send the line "unsubscribe smatch" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Big List of Linux Books]

  Powered by Linux