Re: Curl with asp pages....


On Wed, Dec 17, 2008 at 6:28 AM, ioannes <ioannes@xxxxxxxxxxxxxx> wrote:
> shiplu wrote:
>>
>> On Sun, Dec 7, 2008 at 8:22 AM, ioannes <ioannes@xxxxxxxxxxxxxx> wrote:
>>
>>>
>>> shiplu wrote:
>>>
>>>>
>>>> When you are dealing with curl, anything can be done as long as it's an
>>>> HTTP request. It's all about sending HTTP headers and content.
>>>>
>>>> To parse HTML content you can use an HTML parser. A regular expression
>>>> may not work every time, because the markup pattern changes over time.
>>>>
>>>> Download Wireshark. Collect two sample request and response packets
>>>> from there. Make a format and use it with curl.
>>>> That's it. So simple. You're never going to need to know what is
>>>> generating the site, PHP or ASP.NET.
>>>>
>>>>
>>>>
>>>>
>>>
>>> I downloaded Wireshark onto Windows XP and got as far as Capture Options
>>> for Ethernet, with the Capture Filter set to host <IP address of target
>>> page>. I click Start, go to the browser and access the page, stop
>>> Wireshark, then Save the captured file or Export as HTTP object, which
>>> gives me the source of the page again. Is this what you mean? What do
>>> you mean by 'make a format': do you mean, for instance, parsing the
>>> page with string finder functions? How does this help over identifying
>>> the correct POST variables of the request (using LiveHTTP etc.) and
>>> feeding them into a curl function? What do you mean by 'make a format'
>>> versus 'pattern changes over time'? Is format a Wireshark function, and
>>> if so, where do I find it? Thanks, John
>>>
>>>
>>>
>>
>>
>> "Make a format" is not a button in Wireshark labeled "make a format"
>> that does everything for you. You have to do it yourself. With
>> Wireshark you'll see every type of header and content for almost
>> every type of protocol, so you'll use this software to analyze the
>> HTTP conversation. Data will be not only in the content but also in
>> the headers, so parse both if needed. Then use the same data to make
>> successive requests.
>>
>> If you are using a regular expression, it will fail to match if the
>> pattern changes. Your pattern
>>   '/<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="([^"]*?)" \/>/'
>> will match
>>   <input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="ABC7D5ACSE" />
>> but won't match
>>   <input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="ABC7D5ACSE">
>> Do you see the difference? It won't match
>>   <input type="hidden" id="__VIEWSTATE" name="__VIEWSTATE" value="ABC7D5ACSE" />
>> either, because the attribute order has changed. Your regex will not
>> work, but their website will still render fine. To overcome this, you
>> have to use an HTML/XML parser. Go to each input element, look at its
>> name attribute, and if the name is "__VIEWSTATE", fetch the content of
>> the value attribute. Any input element that carries data must have
>> name and value attributes, so your code will find it practically every
>> time. It will work in 99.99% of cases.
>>
>> Hope that makes sense.
>>
>>
>
> Yes, thanks.  What HTML parser do you suggest?
>
> John
>

For PHP there is the DOM extension.
Documentation can be found at http://www.php.net/dom
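
A minimal sketch of the parser approach described in the quoted message,
assuming the page has already been fetched into a string (with curl, for
example). The function name find_input_value is hypothetical, chosen only
for illustration; DOMDocument and the libxml_* functions are part of PHP's
DOM and libxml extensions. It walks every input element and returns the
value of the one whose name matches, so closing-slash style and attribute
order do not matter.

```php
<?php
// Hypothetical helper (not from the thread): return the value attribute of
// the first <input> whose name attribute equals $name, or null if none exists.
function find_input_value(string $html, string $name): ?string
{
    $doc = new DOMDocument();

    // Real-world HTML is rarely valid, so collect parser warnings
    // internally instead of printing them, then restore the old setting.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('input') as $input) {
        if ($input->getAttribute('name') === $name) {
            return $input->getAttribute('value');
        }
    }
    return null;
}

// The three markup variants discussed in the quoted message.
$samples = [
    '<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="ABC7D5ACSE" />',
    '<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="ABC7D5ACSE">',
    '<input type="hidden" id="__VIEWSTATE" name="__VIEWSTATE" value="ABC7D5ACSE" />',
];

foreach ($samples as $html) {
    echo find_input_value($html, '__VIEWSTATE'), "\n";
}
```

The returned value can then be sent back as the __VIEWSTATE POST field in
the next curl request.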

Thanks.

-- 
A K M Mokaddim
http://talk.cmyweb.net
http://twitter.com/shiplu
Stop Top Posting !!

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php

