Wednesday, February 13, 2008

Can you ask Google and Yahoo not to index your site?

I already knew how web crawlers (bots) index a website. In fact, Google also lets you submit a Sitemap so that it can index the pages on your site better. But what if you don't want some pages to be crawled? Well, today I learned that there is a way to ask crawlers to ignore certain pages on a site. The trick is to place a 'robots.txt' file in the root directory of the site. This text file lists the folders and URLs that should not be crawled.
The protocol, however, is purely advisory. It relies on the cooperation of the web robot, so marking an area of a site out of bounds with robots.txt does not guarantee privacy.
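To make this concrete, here is a small sketch of what a well-behaved crawler might do before fetching a page. It uses Python's standard urllib.robotparser module. The robots.txt contents, the example.com URLs, and the is_allowed helper are all made up for illustration; they are not taken from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents; the paths are invented for this example.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /drafts/secret.html
"""


def is_allowed(url, user_agent="*"):
    """Return True if the sample rules above permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "http://example.com/index.html",
        "http://example.com/private/notes.html",
        "http://example.com/drafts/secret.html",
    ):
        print(url, "->", "allowed" if is_allowed(url) else "disallowed")
```

Note that the check happens entirely on the crawler's side. Nothing stops a bot from skipping it, which is why robots.txt should not be relied on to keep pages private.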
