White Hat (Ethical) SEO
In search engine optimization (SEO) terminology, White Hat SEO refers to the use of strategies, techniques, and tactics that focus on a human audience rather than on search engines, and that fully comply with search engine rules and policies.
robots.txt
The robots.txt file is a set of instructions for visiting robots (spiders) that crawl and index the pages of your web site. For spiders that obey the file, it provides a map of what they may and may not crawl. The file must reside in the root directory of your web site.
Definition of the robots.txt file:
Robots.txt Creation:
To keep all robots out:
User-agent: *
Disallow: /
To allow all robots:
Way 1:
User-agent: *
Disallow:
Some crawlers, most notably Google's, also support an additional field called "Allow:".
Way 2:
User-agent: *
Allow: /
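The difference between "Disallow: /", an empty "Disallow:", and "Allow: /" is easy to get wrong, so it can help to check the rules before uploading them. The following is a minimal sketch using Python's standard urllib.robotparser module; the helper name allowed, the bot name AnyBot, and the example.com URL are illustrative assumptions, and real crawlers may interpret edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt, agent, url):
    """Parse robots.txt text and report whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# The three rule sets discussed above.
BLOCK_ALL = "User-agent: *\nDisallow: /\n"
ALLOW_ALL_EMPTY_DISALLOW = "User-agent: *\nDisallow:\n"
ALLOW_ALL_EXPLICIT = "User-agent: *\nAllow: /\n"

# Hypothetical page used only for the check.
PAGE = "https://example.com/page.html"

if __name__ == "__main__":
    print(allowed(BLOCK_ALL, "AnyBot", PAGE))                 # False
    print(allowed(ALLOW_ALL_EMPTY_DISALLOW, "AnyBot", PAGE))  # True
    print(allowed(ALLOW_ALL_EXPLICIT, "AnyBot", PAGE))        # True
```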
To block a page from all crawlers:
User-agent: *
Disallow: /page name/
To block a page from a specific crawler:
User-agent: Googlebot
Disallow: /page name/
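As a rough illustration of how a rule for every crawler differs from a rule for one named crawler, the sketch below again uses Python's urllib.robotparser. The /private/ path, the OtherBot name, and the example.com URLs are placeholders chosen for this example, not values from any real site.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt, agent, url):
    """Parse robots.txt text and report whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Rule that applies to every crawler (hypothetical /private/ path).
RULES_ALL = "User-agent: *\nDisallow: /private/\n"
# Same rule, but addressed to Googlebot only.
RULES_GOOGLEBOT = "User-agent: Googlebot\nDisallow: /private/\n"

BLOCKED_URL = "https://example.com/private/report.html"
OPEN_URL = "https://example.com/index.html"

if __name__ == "__main__":
    # Every crawler is kept out of /private/ but may fetch other pages.
    print(allowed(RULES_ALL, "OtherBot", BLOCKED_URL))
    print(allowed(RULES_ALL, "OtherBot", OPEN_URL))
    # Only Googlebot is restricted; a crawler with no matching group is not.
    print(allowed(RULES_GOOGLEBOT, "Googlebot", BLOCKED_URL))
    print(allowed(RULES_GOOGLEBOT, "OtherBot", BLOCKED_URL))
```

In this parser, a crawler that matches no group and finds no "*" group is treated as unrestricted, which is why the second rule set leaves OtherBot free to fetch the page.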
To disallow all crawlers from your site EXCEPT Google:
User-agent: *
Disallow: /
User-agent: Googlebot
Allow: /
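The "everyone except Google" file above relies on a crawler using the group that names it in preference to the "*" group. A small sketch of that behaviour with Python's urllib.robotparser follows; OtherBot and the example.com URL are assumed names for illustration only.

```python
from urllib.robotparser import RobotFileParser

# The rules from the text: block everyone, then make an exception for Googlebot.
EXCEPT_GOOGLE = """\
User-agent: *
Disallow: /

User-agent: Googlebot
Allow: /
"""


def allowed(robots_txt, agent, url):
    """Parse robots.txt text and report whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


PAGE = "https://example.com/page.html"  # hypothetical page

if __name__ == "__main__":
    print(allowed(EXCEPT_GOOGLE, "Googlebot", PAGE))  # True
    print(allowed(EXCEPT_GOOGLE, "OtherBot", PAGE))   # False
```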
To disallow the images directory:
User-agent: *
Disallow: /images/
To disallow specific file types (the * and $ wildcards are an extension that major crawlers such as Googlebot support, but not necessarily every robot):
User-agent: *
Disallow: /*.gif$
Disallow: /*.jpg$
Disallow: /*.png$
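To make the wildcard patterns concrete, here is a minimal sketch of how a crawler might match them: "*" stands for any run of characters and a trailing "$" pins the pattern to the end of the path. The functions rule_to_regex and is_blocked are hypothetical helpers written for this illustration; they are not part of any crawler or library. As far as I know, Python's standard urllib.robotparser treats rule paths as plain prefixes, which is why a hand-written matcher is used here.

```python
import re


def rule_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regular expression.

    '*' matches any run of characters; a trailing '$' requires the match to
    end exactly at the end of the path. Everything else is matched literally.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' where '*' appeared.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))


def is_blocked(path, disallow_patterns):
    """Return True if any Disallow pattern matches the start of `path`."""
    return any(rule_to_regex(p).match(path) for p in disallow_patterns)


# The three patterns from the text.
IMAGE_RULES = ["/*.gif$", "/*.jpg$", "/*.png$"]

if __name__ == "__main__":
    print(is_blocked("/images/logo.gif", IMAGE_RULES))   # True
    print(is_blocked("/index.html", IMAGE_RULES))        # False
    print(is_blocked("/logo.gif?size=2", IMAGE_RULES))   # False
```

The last case shows the effect of "$": a URL that merely contains ".gif" but continues afterwards is not matched by these rules.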
Save the file as robots.txt and upload it to the root directory of your site. The root directory is where the index page is stored on your host's server.
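Because crawlers only look for the file at the root of the host, the robots.txt address for any page can be derived from that page's URL. The sketch below shows one way to do this with Python's urllib.parse; the function name robots_url and the example.com address are assumptions made for the example.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url):
    """Return the robots.txt URL for the host that serves `page_url`."""
    parts = urlsplit(page_url)
    # Keep scheme and host, replace the path, and drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


if __name__ == "__main__":
    print(robots_url("https://example.com/blog/post.html?x=1"))
    # https://example.com/robots.txt
```

A copy of the file placed in a subdirectory such as /blog/robots.txt would not be found at this address and would therefore be ignored.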
Robots Meta Tag
If your web host prohibits you from uploading "robots.txt" to the root directory, or you simply wish to restrict crawlers from a few select pages on your site, an alternative to "robots.txt" is to use the robots meta tag.
Creating your "robots" meta tag
The "robots" meta tag looks similar to any other meta tag, and should be added inside the HEAD section of the page(s) in question:
<meta name="robots" content="noindex,nofollow" />
Here's a list of the values you can specify within the "content" attribute of this tag:
Value | Description
(no)index | Determines whether the crawler should index this page. Possible values: "noindex" or "index".
(no)follow | Determines whether the crawler should follow links on this page and crawl them. Possible values: "nofollow" or "follow".
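To see how a crawler might read these values, the following sketch extracts the robots meta tag from a page with Python's standard html.parser module. The class RobotsMetaParser and the helpers page_directives, may_index, and may_follow are hypothetical names for this example; treating a missing value as "index" and "follow" is an assumption about the default, and real crawlers may recognise further values not covered here.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the comma-separated values of <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            for value in content.split(","):
                value = value.strip().lower()
                if value:
                    self.directives.add(value)


def page_directives(html):
    """Return the set of robots directives found in `html`."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    return parser.directives


def may_index(directives):
    # Assumed default: indexing is permitted unless "noindex" is present.
    return "noindex" not in directives


def may_follow(directives):
    # Assumed default: links are followed unless "nofollow" is present.
    return "nofollow" not in directives


SAMPLE = '<html><head><meta name="robots" content="noindex,nofollow" /></head></html>'

if __name__ == "__main__":
    found = page_directives(SAMPLE)
    print(sorted(found))       # ['nofollow', 'noindex']
    print(may_index(found))    # False
    print(may_follow(found))   # False
```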