
Robots.txt Ultimate How To Tutorial
Search Engine Optimization
Monday, 07 January 2008 10:27
Robots.txt is a simple text file in the root folder of your website that tells search engine bots (spiders such as Googlebot, the Yahoo bot, and the MSN bot) what they may crawl. With it you can easily tell them which parts of your website you don't want crawled and indexed.
What are robots (spiders)?
There are good bots and bad (spam) bots. Good bots are, for example, the crawlers of the major search engines. Bad bots are built and used only for negative purposes, such as harvesting email addresses from websites to use later for the spam emails we are all sick of. Keep in mind that robots.txt is only a request: good bots follow it, while bad bots are free to ignore it.
How to create a robots.txt file?
It is very simple: create an empty file named robots.txt and upload it to your website's root folder (usually the public_html folder). Just make the file in Notepad, save it as robots.txt, and upload it with FTP to your site's root folder.
Now let's make it work. In the first example we will address a specific bot. To do that, put this in the first line of your robots.txt file.
User-agent: BotName
Replace BotName with the name of the robot you want to address. To address all robots, use this line instead.
User-agent: *
The second part tells robots which parts of the website should not be crawled.
Disallow: /docs/
In this example, any path on your website starting with the string /docs/ will not be crawled. You can exclude several directories or files for the same robot by using multiple Disallow lines.
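To see how this prefix matching behaves, here is a minimal sketch using Python's standard urllib.robotparser module. The bot name AnyBot, the helper is_allowed, and the sample paths are made up for illustration, and real crawlers may apply extra rules beyond what this parser does.

```python
from urllib.robotparser import RobotFileParser

# The Disallow rule from the example above, addressed to all robots.
ROBOTS_TXT = """\
User-agent: *
Disallow: /docs/
"""

def is_allowed(robots_txt, user_agent, path):
    """Return True if user_agent may crawl path under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)

# Hypothetical paths, used only to illustrate the prefix match.
for path in ["/docs/manual.html", "/docs/", "/index.html", "/documents/a.html"]:
    print(path, is_allowed(ROBOTS_TXT, "AnyBot", path))
```

Paths under /docs/ come back as not allowed, while /index.html and /documents/a.html stay crawlable because they do not start with the string /docs/.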
User-agent: *
Disallow: /docs/
Disallow: /temp/
Disallow: /mypictures
In this example the robots.txt file applies to all bots and instructs them to stay out of the directories /docs/ and /temp/. The third Disallow line tells them to exclude all URLs starting with /mypictures, which covers both folders and files (note that there is no trailing slash).
To block a specific bot from the whole site, add these lines to your robots.txt file. Instead of BotName, put the name of the bot.
User-agent: BotName
Disallow: /
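The rules shown so far can be tried out together. The sketch below, again using Python's standard urllib.robotparser, combines the BotName block with the three Disallow lines for all other bots. OtherBot, load_rules, and the sample paths are hypothetical names chosen for illustration, and the result reflects how this particular parser reads the file, not necessarily every crawler.

```python
from urllib.robotparser import RobotFileParser

# BotName is blocked from the whole site; every other bot gets three rules.
ROBOTS_TXT = """\
User-agent: BotName
Disallow: /

User-agent: *
Disallow: /docs/
Disallow: /temp/
Disallow: /mypictures
"""

def load_rules(robots_txt):
    """Parse robots.txt text and return a ready-to-query parser."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser

rules = load_rules(ROBOTS_TXT)

# Hypothetical bots and paths, used only for illustration.
checks = [
    ("BotName", "/index.html"),
    ("OtherBot", "/index.html"),
    ("OtherBot", "/temp/cache.txt"),
    ("OtherBot", "/mypictures/cat.jpg"),
    ("OtherBot", "/mypictures.html"),
]
for bot, path in checks:
    print(bot, path, rules.can_fetch(bot, path))
```

Only the second check is allowed. BotName is refused everywhere, and because /mypictures has no trailing slash, it matches both the folder /mypictures/ and the file /mypictures.html.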
The following robots.txt has no restrictions at all and allows all bots to crawl the whole site, including all files and folders.
User-agent: *
Disallow:
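An empty Disallow line and a Disallow line with a single slash look similar but mean the opposite. As a rough check, assuming Python's standard urllib.robotparser reads the file the way a well-behaved bot would, the sketch below compares the two. The names ALLOW_ALL, BLOCK_ALL, can_crawl, and AnyBot are illustrative only.

```python
from urllib.robotparser import RobotFileParser

# An empty Disallow value means nothing is excluded.
ALLOW_ALL = "User-agent: *\nDisallow:\n"
# A single slash excludes every path on the site.
BLOCK_ALL = "User-agent: *\nDisallow: /\n"

def can_crawl(robots_txt, path, user_agent="AnyBot"):
    """Return True if the (hypothetical) bot may crawl the given path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)

for name, text in [("allow all", ALLOW_ALL), ("block all", BLOCK_ALL)]:
    print(name, can_crawl(text, "/"), can_crawl(text, "/docs/page.html"))
```

With the empty value every path is reported as crawlable, and with the slash none is, so it is worth double-checking that line before uploading the file.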
Here is a short list of well-known bots and spiders.
Robot Name