On Sun, Dec 7, 2008 at 8:22 AM, ioannes <ioannes@xxxxxxxxxxxxxx> wrote:
> shiplu wrote:
>>
>> When you are dealing with curl, anything can be done as long as it's an
>> HTTP request. It's all about sending HTTP headers and content.
>>
>> To parse HTML content you can use an HTML parser. Regular expressions
>> may not work each time.
>> Pattern changes over time.
>>
>> Download Wireshark. Collect 2 sample request and response packets from
>> there.
>> Make a format and use it with CURL.
>> That's it. So simple. You're never going to need to know who is
>> generating the site, PHP or ASP.NET.
>>
>
> I downloaded Wireshark onto Windows XP, got as far as Capture Options from
> Ethernet, Capture Filter is host <IP address of target page>, click Start,
> go to browser and access page, Stop Wireshark, Save captured file or Export
> as HTTP object which gives me the source of the page again. Is this what
> you mean? What do you mean by make a format - do you mean for instance
> parse the page with string finder functions etc. How is this helping over
> identifying the correct POST variables (using LiveHTTP etc) of the request
> and feeding into a curl function? What do you mean by 'make a format'
> versus 'pattern changes over time' - is format a Wireshark function, if so
> where do I find it. Thanks, John
>

"Make a format" is not a button in Wireshark labelled "make a format" that
does everything for you. You have to do it yourself. With Wireshark you can
see the headers and content for almost every type of protocol, so use it to
analyze the HTTP conversation. Data will be not only in the content but
also in the headers, so parse both if needed. Then use the same data to
make successive requests.

If you are using a regular expression, it will fail to match when the
pattern changes.
Your pattern

  '/<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="([^"]*?)" \/>/'

will match

  <input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="ABC7D5ACSE" />

but won't match

  <input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="ABC7D5ACSE">

Do you see the difference? The closing " />" has become ">". It won't match

  <input type="hidden" id="__VIEWSTATE" name="__VIEWSTATE" value="ABC7D5ACSE" />

either, because the attribute order has changed. Your regex will not work,
but their website will still render perfectly well.

To overcome this, you have to use an HTML/XML parser. Go to each input
element, look at its name attribute, and if the name is "__VIEWSTATE",
fetch the content of the value attribute. Any input element that carries
data must have name and value attributes, so your code will match every
time. It won't fail in 99.99% of cases.

Hope that makes sense.

--
A K M Mokaddim
http://talk.cmyweb.net
http://twitter.com/shiplu
Stop Top Posting !!

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
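
The difference between the regular expression and the parser approach
described in this message can be sketched roughly as follows. This is only
an illustration, written in Python with the standard library rather than
PHP. The names InputValueFinder, viewstate_by_regex, viewstate_by_parser
and SAMPLES are made up for the example, and the three sample strings are
the ones quoted in the message.

```python
import re
from html.parser import HTMLParser

# The pattern from the message, rewritten as a Python regex
# (no PHP delimiters, so the slash needs no escaping).
VIEWSTATE_RE = re.compile(
    r'<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="([^"]*?)" />'
)


class InputValueFinder(HTMLParser):
    """Hypothetical helper: remember the value of the first <input> with a given name."""

    def __init__(self, wanted_name):
        super().__init__()
        self.wanted_name = wanted_name
        self.value = None

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; turning it into a dict
        # makes the lookup independent of attribute order in the markup.
        if tag != "input" or self.value is not None:
            return
        attributes = dict(attrs)
        if attributes.get("name") == self.wanted_name:
            self.value = attributes.get("value")


def viewstate_by_regex(html):
    """Return the captured value, or None when the exact pattern is absent."""
    match = VIEWSTATE_RE.search(html)
    return match.group(1) if match else None


def viewstate_by_parser(html):
    """Return the value attribute of the __VIEWSTATE input, or None."""
    finder = InputValueFinder("__VIEWSTATE")
    finder.feed(html)
    finder.close()
    return finder.value


# The three variants quoted in the message.
SAMPLES = [
    '<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="ABC7D5ACSE" />',
    '<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="ABC7D5ACSE">',
    '<input type="hidden" id="__VIEWSTATE" name="__VIEWSTATE" value="ABC7D5ACSE" />',
]

for sample in SAMPLES:
    print(viewstate_by_regex(sample), viewstate_by_parser(sample))
```

The regex only recognises the first variant, because it encodes the exact
spacing, attribute order and closing slash. The parser version asks only
"is this an input element whose name is __VIEWSTATE?", so it should keep
working when the markup is rearranged.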