Powered by Linux
Re: Detecting user data on base types — Semantic Matching Tool

Re: Detecting user data on base types

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Jun 06, 2019 at 12:40:04PM +0300, Dan Carpenter wrote:
> On Wed, Jun 05, 2019 at 09:29:26AM +0100, Andrew Murray wrote:
> > On Thu, May 30, 2019 at 08:46:42PM +0300, Dan Carpenter wrote:
> 
> > > Then I use the smatch_data/db/smdb.py script to figure out the warnings.
> > > I should add an option so that it only shows callers which pass user
> > > data.  Each call site has a unique caller ID.
> > 
> > Thanks - I hadn't looked at that script, but looks very useful.
> > 
> > By the way with the following hunk you can, for a given function, which call sites
> > pass user data.
> > 
> > @@ -614,6 +642,7 @@ elif sys.argv[1] == "call_info":
> >      print_caller_info(filename, func)
> >  elif sys.argv[1] == "user_data":
> >      func = sys.argv[2]
> > +    filename = sys.argv[3]
> >      print_caller_info(filename, func, "USER_DATA")
> >  elif sys.argv[1] == "param_value":
> >      func = sys.argv[2]
> > 
> 
> Could you send me a normal patch with a Signed-off-by and I will apply
> it?  Otherwise I can handle it if you want.

Yes I will do this, though waiting for legal approval on contributing here first.

> 
> 
> > The database doesn't know that when do_mlock calls apply_vma_lock_flags the first argument of
> > apply_vma_lock_flags is the first argument of do_mlock. There is no data source associated and
> > our tracing of params stops early. Do you have any clue why this may be?
> 
> It's because the start variable gets modified in do_mlock():
> 
> mm/mlock.c
>    671  static __must_check int do_mlock(unsigned long start, size_t len, vm_flags_t flags)
>                                          ^^^^^^^^^^^^^^^^^^^
> 
>    672  {
>    673          unsigned long locked;
>    674          unsigned long lock_limit;
>    675          int error = -ENOMEM;
>    676  
>    677          if (!can_do_mlock())
>    678                  return -EPERM;
>    679  
>    680          len = PAGE_ALIGN(len + (offset_in_page(start)));
>    681          start &= PAGE_MASK;
>                 ^^^^^^^^^^^^^^^^^^
> Modified here.
> 
> I could change this behavior if you wanted.  I'd record the source and
> add an "[m]" or something to say that it had been modified in that
> function...  I'm tempted to change the "p 0" to "$0" but I would leave
> the "r function" format the same for now.

I think it would be helpful to record the source with an [m] modifier - I'm
sure there may be other users for this in the future. I'd be grateful if
you did that.

What is the difference between "p 0" and "$0" as the value? And the r function?

Also how would this work where a value is derived from multiple sources, e.g:

func2(int start, end)
{
}

func1(int addr, int len)
{
	func2(addr, addr + len)
}

The 'end' parameter in func2 is derived from both arguments 'addr' and 'len'
of func1. Perhaps it would be expressed as an additional line in the datbase?

|func1|func2|||8017|0|$|p0
|func1|func2|||8017|1|$|p0
|func1|func2|||8017|1|$|p1

I guess this changes the path into a tree.

Thanks,

Andrew Murray

> 
> regards,
> dan carpenter
> 



[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Big List of Linux Books]

  Powered by Linux