When optimizing your website, most webmasters don't consider using the robots.txt file. This is a very important file for your site. It lets spiders and crawlers know what they can and cannot index, which is helpful for keeping them out of folders you do not want indexed, such as the admin or stats folders.
Here is a list of directives you can include in a robots.txt file and their meanings:
User-agent: In this field you specify which robot the access policy applies to, or use * to match all robots (explained further in the examples below).
Disallow: In this field you specify the files and folders to exclude from the crawl.
Note: Lines beginning with # are comments.
Here are some examples of a robots.txt file:
User-agent: *
Disallow:
The above lets all spiders index all content.
Here is another:
User-agent: *
Disallow: /cgi-bin/
The above blocks all spiders from indexing the cgi-bin directory.
User-agent: googlebot
Disallow:
User-agent: *
Disallow: /admin.php
Disallow: /cgi-bin/
Disallow: /admin/
Disallow: /stats/
In the above example, Googlebot can index everything, while all other spiders cannot index admin.php or the cgi-bin, admin, and stats directories. Notice that you can block individual files like admin.php.
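You can verify how these rules are interpreted without running a real crawler. As a rough sketch, Python's built-in urllib.robotparser module applies the same User-agent matching and Disallow logic described above (the bot name "SomeOtherBot" here is just a made-up example):

```python
from urllib.robotparser import RobotFileParser

# The same rules as the example above.
rules = """\
User-agent: googlebot
Disallow:

User-agent: *
Disallow: /admin.php
Disallow: /cgi-bin/
Disallow: /admin/
Disallow: /stats/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# Googlebot matches its own record, which disallows nothing.
print(parser.can_fetch("googlebot", "/admin.php"))       # True
# Any other spider falls back to the * record.
print(parser.can_fetch("SomeOtherBot", "/admin.php"))    # False
print(parser.can_fetch("SomeOtherBot", "/index.html"))   # True
```

This is handy for sanity-checking a robots.txt file before uploading it to your site.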