What is the purpose of the robots meta tag and how to use it?

In an earlier article, we discussed the role of the robots.txt file, which tells search engine robots not to crawl certain directories and files of your website. The robots meta tag works in conjunction with the robots.txt file to give search engine robots additional information. While robots.txt provides a more generic guideline and largely deals with blocking entire directories, the robots meta tag is a page-specific instruction. It tells the bot whether to index the current page and whether to follow the links on it.

How would a bot go about crawling your website?

A search engine bot enters your website with the objective of crawling and indexing all of its pages. It first checks for a robots.txt file in your website's home directory. If it finds one, it makes a note of all the directories and files that you have instructed it not to crawl. The bot then skips those directories and files and crawls through the other (allowed) ones. During the crawl, the bot reads each page for indexing, but before doing so it checks whether the page contains a robots meta tag. If it does, the bot follows the instruction in the tag. If the tag advises it not to index the page, the bot ignores the page and does not index it. Note that the bot can only see a robots meta tag on a page it is allowed to crawl, so a page blocked by robots.txt never has its meta tag read.
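
The crawl sequence described above can be sketched in Python using only the standard library. This is a minimal illustration, not how any particular search engine is implemented. The names should_index and RobotsMetaFinder, the agent name examplebot and the example.com URLs are all hypothetical, and the page HTML and robots.txt lines are passed in as strings so that nothing is fetched over the network.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class RobotsMetaFinder(HTMLParser):
    """Remembers the content of the first <meta name="robots"> tag seen."""

    def __init__(self):
        super().__init__()
        self.content = None

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta" or self.content is not None:
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").strip().lower() == "robots":
            self.content = attr.get("content", "")


def should_index(url, html, robots_txt_lines, agent="examplebot"):
    """Hypothetical helper: would a well-behaved bot index this page?"""
    # Step 1: robots.txt decides whether the page may be crawled at all.
    rules = RobotFileParser()
    rules.parse(robots_txt_lines)
    if not rules.can_fetch(agent, url):
        return False  # never fetched, so its meta tag is never read

    # Step 2: on a crawlable page, look for a robots meta tag.
    finder = RobotsMetaFinder()
    finder.feed(html)
    if finder.content is None:
        return True  # no tag: assume the default, index

    # Step 3: honour a noindex directive if present.
    directives = {d.strip().lower() for d in finder.content.split(",")}
    return "noindex" not in directives


if __name__ == "__main__":
    robots = ["User-agent: *", "Disallow: /private/"]
    page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
    print(should_index("https://example.com/private/a.html", page, robots))  # False
    print(should_index("https://example.com/about.html", page, robots))      # False
    print(should_index("https://example.com/", "<html></html>", robots))     # True
```

The first call returns False because robots.txt disallows the path, the second because the page itself asks not to be indexed, and the third returns True because neither restriction applies.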

Let us take a look at the syntax of the robots meta tag. A typical robots meta tag looks like this:

<html>
  <head>
  ...  
  <meta name="robots" content="index, follow">
  ...
  </head>
  <body>

  - your html code in this section -
  
  </body>
</html>

If you add the above meta tag to a page, it tells search engine bots to index that page and to follow the links on it. Placed on the home page (index page) of your website, it therefore lets bots index the home page and move on through its links to the rest of your website. If you also have a robots.txt file in your website home directory, search engine bots will skip the files in the disallowed directories and crawl and index the pages under the other directories, including your home directory.

This tag is not case-sensitive. So, the above meta tag can also be written in any of the following ways:

<meta name="robots" content="INDEX, FOLLOW">
<META NAME="robots" CONTENT="INDEX, FOLLOW">
<META NAME="robots" CONTENT="index, follow">

As you may have noticed in the above illustration, the robots meta tag, like any other <META> tag, should be placed in the HEAD section of your HTML page. You can put it on every page of your website, or only on certain specific pages, as per your requirement.

The tag has two attributes. The name attribute normally takes the value robots, meaning that the directive applies to all search engine robots. Of course, you can target a specific bot by using its name instead of the more general robots. For instance, if you want to specify a directive only for Google's bot, set the name attribute to googlebot. In such a case, you can have multiple robots meta tags on a page to target various bots. The content attribute takes a meaningful combination of the following values: index, noindex, follow, nofollow. The values are separated by commas. The valid combinations are:

a. <meta name="robots" content="index, follow">
b. <meta name="robots" content="index, nofollow">
c. <meta name="robots" content="noindex, follow">
d. <meta name="robots" content="noindex, nofollow">

What does each of the above mean?

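a. Index this page and follow the links on it to other pages (except those disallowed by the robots.txt file).
b. Index this page, but do not follow the links on it.
c. Do not index this page, but follow the links on it to crawl other pages.
d. Do not index this page and do not follow the links on it. Placed on the home page alone, this stops bots from reaching the rest of your website through the home page's links, but pages they discover some other way (for example through a sitemap or a link from another website) can still be indexed unless they carry the tag too.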

Note that a page with no robots meta tag is treated as if it carried the default, which is:
<meta name="robots" content="index, follow">.
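
The four combinations, the default, and bot-specific names can be brought together in a short Python sketch. The function name effective_policy is hypothetical, and it takes the (name, content) pairs of a page's meta tags as already extracted. One assumption is labeled in the code: when both a general robots tag and a bot-specific tag are present, the more restrictive directive wins. Individual search engines may resolve this differently.

```python
def effective_policy(meta_tags, bot="googlebot"):
    """Return (index, follow) for `bot`, given (name, content) pairs.

    Hypothetical helper for illustration only.
    """
    # Default when no applicable tag is present: index, follow.
    index, follow = True, True
    for name, content in meta_tags:
        # Only the general "robots" tag and this bot's own tag apply.
        if name.strip().lower() not in ("robots", bot.lower()):
            continue
        for directive in content.split(","):
            directive = directive.strip().lower()
            # Assumption: a restrictive directive in any applicable
            # tag overrides a permissive one.
            if directive == "noindex":
                index = False
            elif directive == "nofollow":
                follow = False
    return index, follow


if __name__ == "__main__":
    print(effective_policy([]))                                 # (True, True)
    print(effective_policy([("robots", "index, nofollow")]))    # (True, False)
    print(effective_policy([("robots", "index, follow"),
                            ("googlebot", "noindex")]))         # (False, True)
```

In the last call the general tag allows indexing, but the googlebot tag does not, so under the stated assumption the page is not indexed by that bot while its links are still followed.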

Thus, using the robots meta tag, you can specify an indexing policy for individual pages of your website. This is particularly useful when you need to stop search engine bots from indexing duplicate pages, so that you do not lose search engine rank on account of duplicate content.

It may be noted that the instruction you provide in the robots meta tag is only advice to the search engine robot. Whether the robot follows your advice or not is its prerogative.

About the Author

Rajeev Kumar

CEO, Computer Solutions
Jamshedpur, India

Rajeev Kumar is the primary author of How2Lab. He holds a B.Tech. from IIT Kanpur and has several years of experience in IT education and software development. He has taught a wide spectrum of people, including fresh young talent, students of premier engineering colleges and management institutes, and IT professionals.

Rajeev founded Computer Solutions & Web Services Worldwide. He has hands-on experience building a variety of websites and business applications, including SaaS-based ERP and e-commerce systems, and cloud-deployed operations management software for healthcare, manufacturing and other industries.
