More robots.txt Examples
One strange-looking edge case is where you don’t place any restrictions on the robot explicitly. In the following snippet, the empty Disallow directive means “no exclusion” for any robot. It is equivalent to having no robots.txt file at all:
User-agent: *
Disallow:
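If you want to confirm how a given robots.txt is interpreted, you can feed it to Python’s standard urllib.robotparser module. The following sketch (the www.example.com URLs are just hypothetical placeholders) shows that the empty Disallow leaves every path crawlable for every robot:

from urllib.robotparser import RobotFileParser

# The rules from the example above: an empty Disallow means "no exclusion."
rules = """\
User-agent: *
Disallow:
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# Every path is allowed, for any robot.
print(parser.can_fetch("googlebot", "http://www.example.com/any/page.html"))  # True
print(parser.can_fetch("msnbot", "http://www.example.com/"))                  # True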
To exclude multiple URLs, simply enumerate them under a single User-agent directive. For example:
User-agent: *
Disallow: /directory
Disallow: /file.html
This excludes any URL whose path begins with /directory, as well as any URL whose path begins with /file.html.
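Keep in mind that Disallow values are matched as plain prefixes of the URL path, so /directory also matches a path such as /directory2. A short urllib.robotparser check (again with a hypothetical www.example.com) illustrates the prefix behavior:

from urllib.robotparser import RobotFileParser

rules = """\
User-agent: *
Disallow: /directory
Disallow: /file.html
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

base = "http://www.example.com"
# Anything whose path begins with /directory or /file.html is excluded.
print(parser.can_fetch("googlebot", base + "/directory/page.html"))  # False
print(parser.can_fetch("googlebot", base + "/directory2"))           # False (prefix match)
print(parser.can_fetch("googlebot", base + "/file.html?page=2"))     # False
print(parser.can_fetch("googlebot", base + "/other.html"))           # True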
To use the same rules for multiple search engines, list the User-agent directives before the list of Disallow entries. For example:
User-agent: googlebot
User-agent: msnbot
Disallow: /directory
Disallow: /file.html
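Because this example has no User-agent: * record, only googlebot and msnbot are restricted; any other robot remains free to crawl everything. A quick urllib.robotparser sketch (hypothetical www.example.com again) makes that visible:

from urllib.robotparser import RobotFileParser

rules = """\
User-agent: googlebot
User-agent: msnbot
Disallow: /directory
Disallow: /file.html
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

url = "http://www.example.com/directory/page.html"
print(parser.can_fetch("googlebot", url))  # False -- listed agent, rules apply
print(parser.can_fetch("msnbot", url))     # False -- listed agent, rules apply
print(parser.can_fetch("slurp", url))      # True  -- no matching record and no * record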
robots.txt Tips
In theory, according to the robots.txt specification, if a record for the user agent * exists as well as a record for a specific robot user agent, and that robot accesses the web site, only the more specific record applies to that robot: only its own Disallow entries are honored, and the Disallow entries under * are ignored. Accordingly, it is necessary to repeat all of the rules listed under * within, for example, googlebot’s record as well, if you also want googlebot to exclude the items listed for User-agent: *.
Thus, the following rules would exclude only Z from googlebot, not X, Y, and Z as you might expect:
User-agent: *
Disallow: X
Disallow: Y
User-agent: googlebot
Disallow: Z
If you want X, Y, and Z excluded for googlebot, you should use this:
User-agent: *
Disallow: X
Disallow: Y
User-agent: googlebot
Disallow: X
Disallow: Y
Disallow: Z
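You can verify this behavior with Python’s urllib.robotparser, which also selects only the most specific matching record. In the sketch below, /x, /y, and /z stand in for the placeholder paths X, Y, and Z, and www.example.com is hypothetical; this models the original specification, and individual crawlers may additionally support extensions of their own:

from urllib.robotparser import RobotFileParser

naive = """\
User-agent: *
Disallow: /x
Disallow: /y

User-agent: googlebot
Disallow: /z
"""

fixed = """\
User-agent: *
Disallow: /x
Disallow: /y

User-agent: googlebot
Disallow: /x
Disallow: /y
Disallow: /z
"""

base = "http://www.example.com"

parser = RobotFileParser()
parser.parse(naive.splitlines())
# googlebot matches its own record, so the * rules are ignored for it.
print(parser.can_fetch("googlebot", base + "/x"))  # True  -- probably not what was intended
print(parser.can_fetch("googlebot", base + "/z"))  # False
print(parser.can_fetch("msnbot", base + "/x"))     # False -- falls back to the * record

parser = RobotFileParser()
parser.parse(fixed.splitlines())
# With the * rules repeated, googlebot is excluded from all three paths.
print(parser.can_fetch("googlebot", base + "/x"))  # False
print(parser.can_fetch("googlebot", base + "/z"))  # False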