RE: parsing form with a website question...

rob,

i'm fully aware of the issues, and for the sites i'm targeting, i can employ
strategies to prune the tree... but the overall point is that i'm looking
for a tool/app/process that does what i've described.

the basic logic is that the app reads a config file and uses it to find the
requisite form, perhaps using xpath in combination with some kind of
pattern recognition/regex functionality...

once the app has the form, it can then get the underlying "stuff"
(selects/lists/items, etc.), which will form the basis for the querystrings
appended to the form action...
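
to make the idea concrete, here's a rough sketch of those steps in python, using only the standard library. it is not an existing tool. the class name SelectCollector, the helper form_urls and the SAMPLE markup are all made up for illustration. it uses html.parser rather than xpath (xpath would need a third-party library), and it assumes every option carries an explicit value attribute and that the form action has no query string of its own.

```python
from html.parser import HTMLParser
from itertools import product
from urllib.parse import urlencode


class SelectCollector(HTMLParser):
    """Hypothetical helper: record each form's action and its <select> options."""

    def __init__(self):
        super().__init__()
        self.forms = []       # one dict per form: {"action": str, "selects": {name: [values]}}
        self._form = None     # form currently being parsed, if any
        self._select = None   # name of the select currently being parsed, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self._form = {"action": attrs.get("action", ""), "selects": {}}
            self.forms.append(self._form)
        elif tag == "select" and self._form is not None and attrs.get("name"):
            self._select = attrs["name"]
            self._form["selects"][self._select] = []
        elif tag == "option" and self._select is not None:
            # Assumption: only options with an explicit value attribute are kept.
            if attrs.get("value") is not None:
                self._form["selects"][self._select].append(attrs["value"])

    def handle_endtag(self, tag):
        if tag == "select":
            self._select = None
        elif tag == "form":
            self._form = None


def form_urls(form):
    """Lazily yield one URL per combination of select values."""
    names = list(form["selects"])
    for combo in product(*(form["selects"][n] for n in names)):
        yield form["action"] + "?" + urlencode(list(zip(names, combo)))


# Made-up markup mirroring the foo/cat example, trimmed to 2 x 3 options.
SAMPLE = """
<form action="test.php">
  <select name="foo">
    <option value="1">a</option><option value="2">b</option>
  </select>
  <select name="cat">
    <option value="1">x</option><option value="2">y</option><option value="3">z</option>
  </select>
</form>
"""

parser = SelectCollector()
parser.feed(SAMPLE)
urls = list(form_urls(parser.forms[0]))
```

form_urls is a generator, so the combinations never have to be held in memory all at once. the list() call at the end is only reasonable because the sample form is tiny.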

ain't life grand!!

thanks...



-----Original Message-----
From: Robert Cummings [mailto:robert@xxxxxxxxxxxxx]
Sent: Thursday, August 14, 2008 4:57 PM
To: bruce
Cc: php-general@xxxxxxxxxxxxx
Subject: Re:  parsing form with a website question...


On Thu, 2008-08-14 at 15:47 -0700, bruce wrote:
> Hi guys...
>
> Got a question that I figured I'd ask before I reinvent the wheel.
>
> A basic website has a form, or multiple forms.  within the form, there might
> be multiple elements (lists/select statements, etc...). each item would have
> a varname, which would in turn be used as part of the form action, to create
> the entire query...
>
> sort of like:
> form action=test.php?
>  option
>   name=foo
>   foo=1
>   foo=2
>   foo=3
>   foo=4
>  /option
>
>  option
>   name=cat
>   cat=1
>   cat=2
>   cat=3
>  /option
> /form
>
> so you'd get the following urls in this pseudo example:
>  test.php?foo=1&cat=1
>  test.php?foo=1&cat=2
>  test.php?foo=1&cat=3
>  test.php?foo=2&cat=1
>  test.php?foo=2&cat=2
>  test.php?foo=2&cat=3
>  test.php?foo=3&cat=1
>  test.php?foo=3&cat=2
>  test.php?foo=3&cat=3
>  test.php?foo=4&cat=1
>  test.php?foo=4&cat=2
>  test.php?foo=4&cat=3
>
> i'm looking for an app that has the ability to parse any given "form" on a
> web page, returning the complete list of possible url combinations based on
> the underlying elements that make up/define the form...
>
> anybody ever seen anything remotely close to this...???
>
> i've been researching crawlers, thinking that this kind of functionality
> would already exist, but so far, no luck!

A little algorithm analysis would show you that doing so requires
storage space that grows exponentially with the number of fields... as
such you won't find it. Also, what would you put into text/textarea
fields? I've heard Google has begun experiments to index the "deep" web,
but they just take somewhat educated guesses at filling in forms rather
than expanding the exponential result set.

For a simple analysis of the problem, take 2 select fields with 2
options each: you have 4 possible outcomes (2 * 2). Now take 3 select
lists with 3 items, 4 items, and 5 items. You now have 60 possible
outcomes (3 * 4 * 5). From this it is easy to see the relationship is
a * b * c * ... * x. So take a form with 10 select fields, each with 10
items. That evaluates to 10^10 = 10,000,000,000. In other words, with a
mere 10 drop-down selects each with 10 items, the solution space
consists of 10 billion combinations. Now let's say each item costs
exactly 1 byte to store, so you need 10 bytes to store one particular
combination. That's 100 billion bytes, AKA 100 metric gigabytes...
and remember, that was just 1 form.
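
The arithmetic above can be checked with a few lines of Python. This is only a sketch of the estimate; the function names combination_count and storage_bytes are made up for illustration, and the 1-byte-per-item cost is the same simplifying assumption used in the paragraph above.

```python
from math import prod


def combination_count(option_counts):
    """Number of distinct submissions: the product a * b * c * ... * x."""
    return prod(option_counts)


def storage_bytes(option_counts, bytes_per_item=1):
    """Bytes needed to store every combination, one item per field."""
    return combination_count(option_counts) * len(option_counts) * bytes_per_item


two_by_two = combination_count([2, 2])        # 2 selects with 2 options each
three_selects = combination_count([3, 4, 5])  # selects with 3, 4 and 5 items
ten_by_ten = combination_count([10] * 10)     # 10 selects with 10 items each
ten_by_ten_bytes = storage_bytes([10] * 10)   # 10 bytes per stored combination
```

Counting the combinations this way costs almost nothing; it is enumerating and storing them that blows up.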

Cheers,
Rob.
--
http://www.interjinn.com
Application and Templating Framework for PHP


-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php

