
Once we have a website up and running, we need to make sure that all visiting search engines can access all the pages we want them to consider.

Sometimes, though, we may want search engines not to index certain parts of the site, or to bar a particular search engine from the site completely.

This is where a simple little two-line text file called robots.txt comes in.

Robots.txt sits in your site's root directory (on Linux systems that is typically your /public_html/ directory), and looks something like this:

User-agent: *
Disallow:

The first line specifies which robot the rule applies to; the second line states whether that robot is allowed in, or which parts of the site it is not allowed to visit.

Simply repeat those two lines for each robot you want to address.

For example:

User-agent: googlebot
Disallow:

User-agent: askjeeves
Disallow: /

This allows Google (user-agent name GoogleBot) to visit and index every page, while at the same time banning Ask Jeeves from the site entirely.
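If you want to double-check how crawlers will interpret your rules, Python's standard-library robots.txt parser can evaluate them for you. A minimal sketch, using the example rules above (the page path is hypothetical):

```python
from urllib.robotparser import RobotFileParser

# robots.txt content mirroring the two-robot example above
robots_txt = """\
User-agent: googlebot
Disallow:

User-agent: askjeeves
Disallow: /
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# GoogleBot may fetch any page (empty Disallow allows everything)
print(parser.can_fetch("googlebot", "/some/page.html"))

# Ask Jeeves is barred from the whole site (Disallow: /)
print(parser.can_fetch("askjeeves", "/some/page.html"))
```

Running a draft of your robots.txt through a parser like this is a quick way to catch mistakes before the file goes live.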

A reasonably current list of robot user-agent names can be found online.

Even if you want every robot to index every page of your site, it is still advisable to put a robots.txt file on your site. It will stop your error logs filling up with entries from search engines requesting a robots.txt file that doesn't exist.


