> This is really two questions about Google indexing
> Let's say I have a site in PHP & MySQL.
> Let's say that I have some links on my site that use GET variables to
> call other PHP pages and pass them a GET variable, like
>
> http://www.somesite.com/somedir/somepage.php?flag=15
>
> The "flag" variable is passed to somepage.php and read by the script
> using $_GET['flag'] etc, etc. When you look at the page with flag=15,
> you get one page, when you look at it with flag=14 you see a similar
> page with completely different content (record #14 instead of #15
> obviously)
>
> Will Google see both pages if I have both linked with <A HREF="">
> tags? Or will it stop at the question mark, only loading the page
> somepage.php and ignore the ?flag=14 and ?flag=15 or whatever? Will
> it index ?flag=14 and ?flag=15 as two separate pages (which is really
> what I want, since they produce different content), or will it treat
> both as the same page?

If Google (et al) see the pages AT ALL, they will see them as separate
pages.

They have different URLs <==> they are different pages

Some search engines skip all URLs with ? in them.

Others selectively use the URLs with ? in them.

It's possible (but unlikely) that some search engines use *all* the
URLs with ? in them.

I don't know what Google, specifically, will do under what
circumstances, much less what they might decide to do tomorrow or next
year.

> SECOND QUESTION, RELATED:
>
> Same scenario, but with a POSTed form. I have several hidden FORM
> fields, and a drop-down, and depending on how you submit the form you
> get different content on the resulting page.
>
> Will Google submit the form, perhaps a couple of different ways and
> treat each resulting page differently, or will it just bypass the
> form altogether?

I don't know of *ANY* search engine that will POST data to get to
content.

I sincerely doubt they would want to do that, really.
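
Since the engines follow links rather than submit forms, the records
behind a form can also be exposed through plain GET links. Here is a
minimal sketch of that idea, assuming integer record IDs. The function
names flag_from_query() and record_links() are made up for illustration,
and the IDs 14 and 15 are just the ones from the example above; on a real
site they would come from a MySQL query.

```php
<?php
// Hypothetical helpers for illustration only; not from any library.

/**
 * Return the record ID from a query-string array such as $_GET,
 * or null when "flag" is missing or is not a positive integer.
 */
function flag_from_query(array $query): ?int
{
    $flag = $query['flag'] ?? null;
    // Accept only 1 to 9 digits with no leading zero, so input like
    // "14abc", "-1", "0" or an array (flag[]=1) is rejected.
    if (!is_string($flag) || preg_match('/^[1-9][0-9]{0,8}$/', $flag) !== 1) {
        return null;
    }
    return (int) $flag;
}

/**
 * Build a plain HTML list of <a href> links, one per record ID,
 * so a crawler that follows links can reach every record.
 */
function record_links(array $ids, string $page = 'somepage.php'): string
{
    $html = "<ul>\n";
    foreach ($ids as $id) {
        $id  = (int) $id;
        $url = $page . '?' . http_build_query(['flag' => $id]);
        $html .= '  <li><a href="' . htmlspecialchars($url) . '">Record '
               . $id . "</a></li>\n";
    }
    return $html . "</ul>\n";
}

// Assumed sample IDs; a real page would load these from the database.
echo record_links([14, 15]);
```

Inside somepage.php you would call flag_from_query($_GET) and show a
"not found" page when it returns null. Whether any given engine actually
follows the ?flag= links is still up to the engine.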
> THIRD QUESTION:
>
> If the answers to the questions above are Yes and No, then I could
> use a dynamically generated list of links with ?flag= to make Google
> crawl through the part of the MySQL content (as displayed through the
> scripts in HTML) that I want it to, using links and GET variables,
> right?

Maybe. Google might do it, while others won't and vice versa.

> If the answers to the questions above are No and No, do I have to set
> up a static .php page for EVERY record in my MySQL database to make
> it see that content I want it to see? Does anyone use the error.php
> page to catch for a 404 Not Found error, see if it can match the
> "ghost" name to a record in the DB, and display a page anyway (even
> though technically there is no somepage.php page, the error.php page
> knows to go look in the database for "somepage" and displays its
> content)? I wonder if this would be a good optimization strategy.

There are several other possible solutions:

1. Use your robots.txt file to send the search engines to a "secret"
   page that links to all your content, asking the engine to index
   that page.

2. Use Apache's mod_rewrite module to change URLs like:
   http://example.com/page.php?flag=14
   to URLs like:
   http://example.com/page.php/flag/14
   or
   http://example.com/page.php/flag=14
   or
   http://example.com/page/flag=14/page.htm

3. Use PHP and $_SERVER['PATH_INFO'] to do all the same things as in
   #2.

There are many examples/articles "out there" on how to do this. Google
for "PHP $_SERVER PATH_INFO" and you should find some.

Also be sure to Google for "robots.txt search engines" to find out
more about the robots.txt file -- While I don't use it much myself,
others find it useful.

Finally, a note of caution.

At some point, if you have *enough* records, you don't want to make
the URLs look static. If you do, and force Google (et al) to index,
say, a MILLION relatively un-interesting pages...
Put yourself in Google's shoes:
"Hey, here's this goofball that made us index 1,000,000 pages of
uninteresting content. Let's just put him on the blacklist and not
index his site at all."

Use some common sense here, or suffer the consequences.

Let me give you an example:

Suppose you were responsible for maintaining a list of, oh, I don't
know, registered Republicans/Democrats. Further suppose, for some
reason, that making this list public on the web was legal (I dunno)
and you wanted to, or, more likely, your boss wanted you to do that.

You *could* have a site where every registered voter was on their own
page with an elephant or donkey, and you *could* make static-looking
URLs to force them to get Googled...

Or you could have static-looking *pages* so every page has a couple
hundred, or even a thousand people, to get Googled, with a nice common
masthead with the elephant or donkey.

If you force the search engines to index those zillion pages, one for
each person, you're going to make somebody cranky. Somebody you *want*
to be friends with.

OTOH, if you arrange it so they aren't indexing *too* many pages, and
the content is useful to potential visitors, they'll like you.

I can definitely state, for the record, that it's VERY effective to
make your URLs look static -- I maintain a free online database of
music venues for touring indie musicians, and used to have dynamic
URLs.

Only a few days after changing to static URLs, I suddenly noticed that
when I was searching for the venues that were out of date, *my* pages
were popping up very high in the rankings. In fact, if the venue had a
site, my page was usually right after theirs. If they had no site, my
page was turning out #1 almost all the time. (For venues that had
distinctive names.)

This is not because I'm some search engine expert, but because the
content being searched for matched the content I was delivering.

If you Google for those venue names, the CHaT site will pretty much
always come up really high in the list.
Google for a venue that's been closed for a while, and we're pretty
much the only result.

[aside]
We track closed venues because artists are often steered toward them
by other out-of-date references or well-meaning former residents of
their target destinations as they tour around the country. It also
helps to know about a venue that closes/re-opens, Under New
Management, and closes/re-opens again on a routine basis. These are
generally music venues you want to avoid, because often the reasons
for their closing/re-opening will adversely affect your desire to
perform there.
[/aside]

Be sure to convert your internal links to the static-looking links,
even if you code it so both work equally well, as I did to keep legacy
URLs valid. The search engines will like your pages better because
they'll find your internal links.

You can see examples here:
http://chatmusic.com/venuealpha/a

Feel free to add your favorite music venue if it's not in there!

--
Like Music?
http://l-i-e.com/artists.htm

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php