Thank you, Richard. I'd better explain a little about the project, and then
you or somebody else might be able to offer good suggestions given its
constraints.
The project is to implement a digital library protocol called OAI-PMH
(http://www.openarchives.org/OAI/openarchivesprotocol.html).
It acts as a broker that lets harvester clients get metadata from libraries
that run Z39.50 servers. The databases reside at the libraries and vary a lot
in speed, number of records, and the way they accept connections from Z39.50
clients; some libraries hold over a million records. So the part that fetches
data from those libraries behaves very differently from library to library.
The harvester client sends HTTP requests, normally from a program such as
Perl's LWP, which typically sets a 180-second connection timeout.
According to the protocol, when the OAI-PMH data provider responds to a
harvester's HTTP request, it connects to the specific library's Z39.50
server, pulls the data in, writes it to disk, translates it to another XML
format, and sends it to the harvester client. If there are too many records,
the provider should send back partial data with a resumption token. The
harvester can later send another HTTP request to the same URL, with the
resumption token as one of the POST variables, to get further records. This
process continues until all the records have been sent.
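To make the chunking concrete, here is a minimal sketch of how a provider
might split a record set into pages and hand back a resumption token. The
function name and the "token = next offset" encoding are my inventions, not
anything the spec mandates:

```php
<?php
// Split $records into one page starting at $offset. The returned token
// (here simply the next offset, encoded as a string) tells the harvester
// where to resume; a null token means the set is complete.
function page_records(array $records, int $offset, int $pageSize): array
{
    $page  = array_slice($records, $offset, $pageSize);
    $next  = $offset + $pageSize;
    $token = $next < count($records) ? (string) $next : null;
    return [$page, $token];
}
```

In a real provider the token would more likely encode a Z39.50 result-set
position plus the original request arguments, but the shape is the same.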
Thus I normally use a Perl program, not a browser, to send the HTTP request
and fetch the content, so the buffering behavior should not be due to
browser settings.
I cannot echo the metadata directly back, since XSLT is needed to transform
it and new XML file(s) are written. The header() redirection would be very
natural to use if it closed the connection before I do the very
time-consuming work after it.
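For what it's worth, one trick I've seen for "answer now, keep working" is to
send a complete response with Connection: close and an explicit
Content-Length, flush everything, and carry on; ignore_user_abort() keeps the
script alive after the client disconnects. This is only a sketch, and whether
the connection really closes still depends on Apache's buffering setup:

```php
<?php
// Send a complete response now, then let the script keep running.
// Assumes no output has been sent yet; server-side buffering
// (mod_gzip and friends) can still defeat it.
function send_early_response(string $body): void
{
    ignore_user_abort(true);               // keep running after disconnect
    header('Connection: close');
    header('Content-Length: ' . strlen($body));
    echo $body;
    while (ob_get_level() > 0) {           // drain any output buffers
        ob_end_flush();
    }
    flush();                               // push it out to the client
}
```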
exec() with & and a cron job are hard to use here, since a Z39.50 connection
carries a lot of state (connection ID, etc.) that cannot easily be passed to
another script.
The harvester is normally not a human with a browser but a piece of code
that loops, sending HTTP requests as long as the page it gets back contains
a <resumptionToken> tag. (It takes the text between the open and close tags
of <resumptionToken> and appends it to the next HTTP request as a POST
variable to fetch the next page of records.) The problem is that each HTTP
request imposes a timeout of 180 seconds.
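That loop boils down to pulling the token out of each page. A minimal sketch
of the extraction (a regex stand-in for real XML parsing, with an invented
function name):

```php
<?php
// Pull the resumption token out of an OAI-PMH response page.
// An empty <resumptionToken></resumptionToken> (or none at all) means
// the list is complete, so we return null in that case.
function extract_resumption_token(string $xml): ?string
{
    if (preg_match('#<resumptionToken[^>]*>([^<]*)</resumptionToken>#',
                   $xml, $m) && $m[1] !== '') {
        return $m[1];
    }
    return null;
}
```

A production harvester should use a real XML parser, since tokens may contain
entities and the element carries attributes like expirationDate.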
Thus I have to return partial data within 3 minutes, while the whole process
might take hours or even days. The process then continues fetching data from
the library server, transforming it, and writing it to disk in a particular
directory. When the next request with a resumption token comes in, the
program checks whether that directory exists and returns its contents if it
does. If it does not exist yet, the program either manages to return within
3 minutes or sends back a "not available" response.
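That check-the-directory step might be sketched like this (the directory
layout and file name are made up, not part of your design):

```php
<?php
// If the chunk for $token has already been harvested to disk, return it;
// otherwise return null so the caller can harvest or answer "not ready".
// A real implementation must sanitize $token before using it in a path.
function fetch_cached_chunk(string $baseDir, string $token): ?string
{
    $file = rtrim($baseDir, '/') . '/' . $token . '/records.xml';
    return is_file($file) ? file_get_contents($file) : null;
}
```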
Sorry for the long write-up. I hope someone has a suggestion for me.
Thank you very much.
-------------------------------------------------------------------------------
> I now encounter a problem with flow control of my program with PHP. This
> is very crucial to the design of a pretty big project. This is what I
> want to do in the program:
>
> <?php
> do_A();
> header("Location: ".$result_of_do_A);
Depending on the buffering options in php.ini and/or Apache, this may or
may not just end your program, as I understand it.
Once you send the Location header, everything else is irrelevant, or at
least not reliable.
You could do:
echo $result_of_do_A;
flush();
and the user will see what happened with A, while waiting for B.
> do_B();
> ?>
>
> Since it takes do_B() quite a while to finish, I want the http client
> to get the partial result from do_A() via a redirect before do_B()
> starts. But it seems that the redirection only occurs after the entire
> php program finishes, i.e., after do_B(). I sent the http request
> through a browser, through curl on the command line with the -N (no
> buffer) option, and with a perl LWP program I wrote. All of them
> suggest that header(), although it is put before do_B() in the code,
> takes effect only after all the php code has finished. I added flush()
> after header() too, but it did not work.
If that is what you are seeing, you probably have output buffering
turned "on".
The Location: header is acted upon by the BROWSER, not by PHP, not by your
server. The BROWSER sees that header and then jumps to somewhere else.
> My question is: Is there any way that I can return to the client through
> an http response and then continue running my program?
You could also look into the pcntl stuff to fork() or, depending on
various settings, you might get:
exec("php do_B.php > /dev/null 2>&1 &");
to get B to happen in the background. (exec() runs a shell command, so B
has to live in its own script, and its output must be redirected or PHP
will wait for it to finish anyway.)
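The pcntl route, if the extension is available (usually only in CLI/CGI
PHP, not under Apache's mod_php), might be sketched like this, with an
invented helper name:

```php
<?php
// Fork off the slow work: the parent returns at once with the child's
// pid, while the child runs $work and exits. Requires the pcntl
// extension, which is typically absent in web-server SAPIs.
function fork_background(callable $work): int
{
    $pid = pcntl_fork();
    if ($pid === -1) {
        throw new RuntimeException('fork failed');
    }
    if ($pid === 0) {       // child process
        $work();
        exit(0);
    }
    return $pid;            // parent process
}
```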
With all that said: As a general rule, when I found myself doing this
kind of stuff, I later realized that I hadn't really designed my
application very well for an end-user experience.
If it takes THAT long to finish B, then you're probably MUCH better off
putting something in a "ToDo List" in your database, and doing B "later"
in a cron job.
Then notify the user, through email or some kind of status display they
will see on your site frequently, when "B is done".
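The simplest version of that ToDo list can just be rows in a database
table, or even files in a spool directory. A file-based sketch, with
invented names and no locking:

```php
<?php
// Queue a job as a JSON file; a cron-driven worker picks up the oldest
// one. File-based only for the sketch; a database table works the same
// way and handles concurrency properly.
function enqueue_job(string $spoolDir, array $job): string
{
    $path = $spoolDir . '/' . uniqid('job_', true) . '.json';
    file_put_contents($path, json_encode($job));
    return $path;
}

function next_job(string $spoolDir): ?array
{
    $files = glob($spoolDir . '/job_*.json');
    if (!$files) {
        return null;
    }
    sort($files);                 // uniqid names sort roughly oldest-first
    $job = json_decode(file_get_contents($files[0]), true);
    unlink($files[0]);            // claim it (no locking in this sketch)
    return $job;
}
```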
NEVER make the user sit around waiting for your server. Human time is far
far far too precious (and expensive!) to waste it sitting around doing
nothing useful waiting for your program to finish.
--
Like Music?
http://l-i-e.com/artists.htm
--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php