Re: [users@httpd] MSN-Bot doesn't finish a download?

Thank you for your reply!

Joshua Slive wrote:

On 5/2/05, Abu Hurayrah <abu_hurayrah@xxxxxxxxxxxxx> wrote:
Greets to all!

I've just noticed that when the MSNbot crawls my website and hits some of my
downloads, it doesn't download the whole file.

Most search engines are only interested in the first x bytes of the
file, so the bot may simply be dropping the connection after it gets
what it wants.

That's a fair assumption. However, the amount downloaded is ALWAYS exactly the same as the $chunk_size value in my download script.

MSN seems to only catch ONE chunk, no matter what size I make it, which I
find very strange, because I cannot think of why my implementation would
matter to MSN or not.

What is the smallest "chunk" size you have tried?  You are probably
just not detecting the dropped connection until you have sent a chunk,
so you don't really know what the bot is accepting.

Joshua.

I've tried sizes ranging from 50,000 bytes to 500,000 bytes, and MSN always gets exactly that much.

Previously, when I was sending each file all at once, MSN would download the ENTIRE file. I cannot see what would stop it from downloading the entire file now that I split the download into discrete chunks. As far as I know, I am not mangling the data in any way; I am simply sending it in chunks to reduce the memory footprint of each instance of my download script.
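
To make the point about detecting the dropped connection concrete, here is a minimal Python sketch. It is not the download script discussed in this thread (that script is not shown), and the names send_in_chunks and DroppingClient are hypothetical. It assumes a simplified model in which a write is accepted whole (as if buffered) and the sender only learns that the client has gone away on the next write. Real TCP buffering is more involved, so treat this as an illustration of the idea rather than a description of what MSNbot or Apache actually does.

```python
import io


def send_in_chunks(source, sink, chunk_size=50_000):
    """Copy source to sink one chunk at a time.

    Returns the number of bytes the sender believes it delivered,
    i.e. the bytes written before a write raised a connection error.
    """
    sent = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break  # end of file: everything was sent
        try:
            sink.write(chunk)
        except (BrokenPipeError, ConnectionResetError):
            break  # the drop is only noticed at a chunk boundary
        sent += len(chunk)
    return sent


class DroppingClient:
    """Hypothetical client that wants `limit` bytes, then hangs up.

    Assumption: a write in progress is accepted in full, and the
    disconnect only becomes visible on the following write.
    """

    def __init__(self, limit):
        self.limit = limit
        self.received = 0

    def write(self, data):
        if self.received >= self.limit:
            raise BrokenPipeError("client closed the connection")
        self.received += len(data)
        return len(data)


if __name__ == "__main__":
    for size in (50_000, 500_000):
        payload = io.BytesIO(b"x" * 2_000_000)
        client = DroppingClient(limit=10_000)
        print(size, send_in_chunks(payload, client, chunk_size=size))
```

In this model the client only wants the first 10,000 bytes, yet the sender records exactly one full chunk for either chunk size. That matches the behaviour described above, and it is why trying a much smaller chunk size would give a better idea of how much the bot really accepts.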

---------------------------------------------------------------------
The official User-To-User support forum of the Apache HTTP Server Project.
See <URL:http://httpd.apache.org/userslist.html> for more info.


