Re: Question about how to fetch html?

I suggest using a server-side language like PHP to do the heavy
lifting for you. There are existing functions that already take
care of fetching and parsing a page.
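
To make the idea concrete, here is a minimal sketch of the buffer-first
approach: collect every partial chunk, and only then parse, modify and
send. It is in Python rather than PHP purely for illustration. The names
buffer_then_rewrite and add_banner are made up for this example, and the
chunk list stands in for the partial reads from the backend; this is not
httpd code.

```python
from typing import Callable, Iterable, Iterator


def buffer_then_rewrite(
    chunks: Iterable[bytes],
    rewrite: Callable[[str], str],
    encoding: str = "utf-8",
) -> Iterator[bytes]:
    """Collect every partial chunk, rewrite the whole document, then emit it."""
    # Hold back every chunk instead of passing it downstream immediately.
    body = b"".join(chunks)
    # Only now is the complete HTML available for parsing and analysis.
    html = body.decode(encoding, errors="replace")
    modified = rewrite(html)
    # Nothing has been sent yet, so the modified document goes out in one piece.
    yield modified.encode(encoding)


def add_banner(html: str) -> str:
    """Hypothetical modification: insert a comment right after <body>."""
    return html.replace("<body>", "<body><!-- seen by proxy -->", 1)


# Stand-in for the partial reads a proxy receives from the web server.
upstream = [b"<html><bo", b"dy><p>hi</p>", b"</body></html>"]
result = b"".join(buffer_then_rewrite(upstream, add_banner))
print(result.decode("utf-8"))
```

Two things to keep in mind with this approach: the whole page sits in
memory until the last chunk arrives, and because the body length changes,
any Content-Length header has to be recomputed or dropped before the
response is sent.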

On Fri, Jun 12, 2009 at 5:47 PM, Brian Kim<09su.research@xxxxxxxxx> wrote:
> Thanks.
>
> Sorry for unclear explanations.
>
> Basically I want to make my proxy system do (1) parsing the html data,
> (2) analyzing the html data, (3) modifying some of the html data and
> then sending it to users.
>
> Here, the problem is that it is hard to finish (1), (2) & (3) jobs
> before sending the html data.
>
> I am looking for a way to implement this.
>
> Does André or anybody have any idea?
>
>
>
> On Fri, Jun 12, 2009 at 5:15 PM, André Warnier<aw@xxxxxxxxxx> wrote:
>> Brian Kim wrote:
>>>
>>> Hi all.
>>>
>> Hi Brian.
>>
> >>> Currently I am creating an HTTP-based proxy system to fetch HTML data
> >>> between the users' browser and the web server.
>>
>> That's usually what browsers do already, but ok..
>>
>>>
>>> In fact, I did it by adding some code in
>>> ap_proxy_http_process_response function as follows
>>>
>> ... some courageous lines removed for clarity here ...
>>
>>>
>>> By the way, this given code seems to repeat getting a partial html and
>>> passing it down.
>>
>> Yep, that's usually what webservers do. And proxies too.
>> Suggestions and patches for improvement are always welcome though.
>>
> >>> However, I want to parse the complete html, analyze it and send it to users'
> >>> browser.
>>
>> Well, the browser already does that all by itself, so it's not clear what
>> your purpose is, here.
>>
>>>
>>> By concatenating the partial html, I can create a complete html data
>>> and parse it.
>>
>> That basic idea is ok.
>>
> >>> However, it only can happen after the html is already sent according to the
> >>> above program structure.
>>
>> Don't understand exactly what structure you're referring to, but yes, it is
>> hard to parse the html before the server sent it.
>>
>>>
>>> Does anyone know about how to fix this problem?
>>
>> It is not quite clear what the problem is, here.
>>
> >>> Is there any general way to fetch html?
>>
>> With a browser maybe ?
>>
>> ;-)
>>
>>
>> ---------------------------------------------------------------------
>> The official User-To-User support forum of the Apache HTTP Server Project.
>> See <URL:http://httpd.apache.org/userslist.html> for more info.
>> To unsubscribe, e-mail: users-unsubscribe@xxxxxxxxxxxxxxxx
>>  "   from the digest: users-digest-unsubscribe@xxxxxxxxxxxxxxxx
>> For additional commands, e-mail: users-help@xxxxxxxxxxxxxxxx
>>
>>
>



-- 
A: It reverses the normal flow of conversation.
Q: What's wrong with top-posting?
A: Top-posting.
Q: What's the biggest scourge on plain text email discussions?



