HTML Author's Guide to the Robots Exclusion Protocol
The Web Robots Pages


The Robots Exclusion Protocol requires that instructions be placed at the URL "/robots.txt", i.e. in the top level of your server's document space.

If you rent space for your HTML files on the server of your Internet Service Provider, or another third party, you are usually not allowed to install or modify files in the top-level of the server's document space.

This means that to use the Robots Exclusion Protocol, you have to liaise with the server administrator and ask them to add your rules to "/robots.txt", following the Web Server Administrator's Guide to the Robots Exclusion Protocol.
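For instance, the rules you would ask the administrator to add might look like the following (the path "/~you/private/" is a hypothetical example; substitute the directories you actually want excluded):

```
# Added to /robots.txt by the server administrator
User-agent: *
Disallow: /~you/private/
```

A blank line separates this record from any others already in the file, so your rules can coexist with rules for other users of the same server.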

There is no way around this -- specifically, there is no point in providing your own "robots.txt" files elsewhere on the server, such as in your home directory or subdirectories; robots won't look for them, and even if they did find them, they wouldn't pay attention to the rules there.

If your administrator is unwilling to install or modify "/robots.txt" rules on your behalf, and all you want is to prevent your pages from being indexed by indexing robots like WebCrawler and Lycos, you can add a Robots META tag to every page you don't want indexed. Note that this functionality is not implemented by all indexing robots.
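The Robots META tag goes in the HEAD section of each page. A minimal sketch (the title and body are placeholders; only the META element matters):

```html
<html>
<head>
<meta name="robots" content="noindex">
<title>Your page title</title>
</head>
<body>
...
</body>
</html>
```

The "noindex" value asks robots not to index the page; a value of "noindex, nofollow" additionally asks them not to follow the links on it. Unlike "/robots.txt", this requires no cooperation from the server administrator, since you control the HTML files yourself.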

