> Look at this (completely untested) loop:
>
>   # a little setup
>   cmd=`basename "$0"`
>   : ${TMPDIR:=/tmp}
>   tmppfx=$TMPDIR/$cmd.$$
>
>   i=0
>   while read -r url
>   do
>     i=$((i+1))
>     out=$tmppfx.$i
>     if curl -s "$url" >"$out"
>     then echo "$out"
>     else echo "$cmd: curl fails on: $url" >&2
>     fi &
>   done < myURLs \
>   | while read -r out
>     do
>       cat "$out"
>       rm "$out"
>     done \
>   | tee all-data.out \
>   | your-data-parsing-program

I understand the script, although I haven't tested it either. My take on it:

+ it solves the problem of the curls overwriting each other's output
  (I think)
+ the data parsing and tracking are done on the combined curl output
- it retrieves the urls serially, not in parallel
- it writes them to disk
- it re-reads them from disk, hence some disk activity, although
  probably insignificant relative to the download time.

The way I'm doing it now is this: I do the retrieval, the parsing and the
tracking all within a single program. For each url I create a separate
thread, from which I call curl, read its output, and then parse it.
Like this:

    // inside each thread:
    const size_t bufSize_ = 1<<20;   // 1 MiB, sufficiently large
    char buf_[bufSize_];             // local to each thread
    // form the curl command = "curl -s $url"
    fd_ = popen(curl_command, "r");  // omit error checking here
    nRead_ = fread(buf_, sizeof(char), bufSize_ - 1, fd_);
    buf_[nRead_] = '\0';             // terminate, since parse() gets no length
    pclose(fd_);
    parse(buf_);                     // to a struct visible from all threads

    // when threads done, analyze the combined info.

This works, but I would have liked a more modular solution. I want the
url retrieval to be a separate, standalone entity and the parsing and
tracking another entity (possibly two entities). Hence, what I want is:

- in a shell
- download in parallel
- merge the curl outputs, then pipe them into the parser/tracker.

Parsing can be done per url, but tracking MUST be across urls.
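To make those three requirements concrete, here is a minimal sketch of one way they might fit together in plain sh. It is only an illustration under some assumptions: `fetch_all` and `FETCH` are hypothetical names introduced here, `mktemp -d` is assumed to be available (it is not in POSIX, though Fedora ships it), and one temporary file per url is used simply because that is the easiest way to keep concurrent outputs from interleaving.

```shell
#!/bin/sh
# Sketch only: fetch_all and FETCH are hypothetical names, not from the thread.
# FETCH is the command used to retrieve one url; it defaults to "curl -s".
: "${FETCH:=curl -s}"

# fetch_all URLFILE
# Start one background fetch per url, wait for all of them, then write the
# results to stdout in input order, one complete document after another.
fetch_all() {
    urlfile=$1
    dir=$(mktemp -d) || return 1   # assumption: mktemp -d exists on this system
    i=0
    while read -r url
    do
        i=$((i+1))
        # Each job writes to its own file, so outputs cannot interleave.
        # The trailing & backgrounds the whole "fetch || report" list.
        $FETCH "$url" </dev/null >"$dir/$i" || echo "fetch failed: $url" >&2 &
    done <"$urlfile"
    wait                           # block until every background fetch is done
    n=1
    while [ "$n" -le "$i" ]
    do
        cat "$dir/$n"              # a failed fetch contributes an empty file
        n=$((n+1))
    done
    rm -rf "$dir"
}
```

With that in place, the pipeline from the quoted script would become something like `fetch_all myURLs | tee all-data.out | your-data-parsing-program`. The temporary files still exist, but if disk activity matters, setting TMPDIR to a tmpfs such as /dev/shm should keep them in memory. Nothing reaches the parser until the slowest download has finished, which seems acceptable here since tracking has to see all urls anyway.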