Sitemap Generator (sm.php)
I have millions of pages on this site, and many more are added every day, so I became quite frustrated
with the Python-based sitemap software supplied by Google. Eventually that frustration led me to write
my own directory-walking sitemap software, which I have decided to release under GPL version 3.
It uses the same config files as the Python software and can also submit sitemaps to multiple
search engines.
I have only tested this software on Linux. If you run it on another OS, please let me know how well it works.
Problems I found with the Python based sitemap generator
- Walks directories that are excluded in the config file. This wastes a great deal of time if the
excluded directory is large.
- After stat()'ing every file in a directory it then does an lstat() on every file. If you really need
to do the lstat() call (and you don't!), it would be much faster to do it right after the stat() when the inode is
- Unless TZ is set in the environment, it'll stat() the timezone config file after every stat() call to make sure the timezone hasn't changed in the last few milliseconds.
- Uses large amounts of memory (hundreds of MB in my case).
- Uses a lot of CPU time.
- Reads the entire set of directory entries into memory before processing them. All that is really needed is to read one at a time in a loop.
All these problems have been addressed in the software I wrote.
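
To make a few of the points above concrete, here is a minimal sketch in Python of a directory walk that skips excluded directories without entering them, handles directory entries one at a time, and makes at most one stat-style call per file. It is only an illustration of the general technique, not the code in sm.php. The function name walk_files and the excluded set of absolute paths are assumptions made for this example and do not reflect the real config file format.

```python
import os
import stat
from typing import Iterator, Set, Tuple


def walk_files(root: str, excluded: Set[str]) -> Iterator[Tuple[str, float]]:
    """Yield (path, mtime) for each regular file under root.

    `excluded` is a hypothetical set of directory paths (written the same
    way as `root`) that must never be entered.
    """
    stack = [root]
    while stack:
        current = stack.pop()
        # os.scandir returns an iterator, so entries are processed one at a
        # time instead of being collected into a full list first.
        with os.scandir(current) as entries:
            for entry in entries:
                if entry.is_dir(follow_symlinks=False):
                    # Prune here: an excluded directory is never opened,
                    # so nothing beneath it is ever read or stat()'ed.
                    if entry.path not in excluded:
                        stack.append(entry.path)
                    continue
                # At most one lstat-style call per file; the DirEntry
                # object caches the result for later use.
                info = entry.stat(follow_symlinks=False)
                if stat.S_ISREG(info.st_mode):
                    yield entry.path, info.st_mtime
```

Because walk_files is a generator, the caller can write each URL to the sitemap as it arrives, so memory use stays roughly constant no matter how many files the tree contains.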