Robots.TXT Essential Configuration Guide

Consultants, agencies, and freelance digital marketing and SEO professionals know, or should know, just how important it is for search engines to index your content. A number of techniques are applied to help Google crawl your customers’ websites better and index all of their pages and content, through on-page and off-page actions such as content enhancement, links, tags, meta descriptions, image optimization, robots.txt, etc.

But as I say in my lectures, on-page SEO can be divided into two parts: content and technique. The content part covers text optimization for keywords, image size, and the creation of internal and external links. The technical part is responsible for the XML sitemap, microformats, Google AMP, robots.txt and meta robots.

Robots are best known to the “old school” crowd that came from the development side. But if you’ve never heard of robots, do not be scared. This post is made both for those who are learning about robots.txt and for those who already understand it and just want a refresher. Shall we?

What is Robots.txt?

Robots.txt is a text file used to instruct the robots (also called spiders) that search engines such as Google and Bing use to crawl and index the pages of your site. The robots.txt file is placed in the root directory of your site so that these robots can find it immediately.

So that each search engine does not define its own rules for its crawlers, they all obey a standard called the Robots Exclusion Protocol (REP), created in 1994 and last modified in 2005.

Because robots.txt files tell search bots which parts of the site to crawl and which to skip, knowing how to use and set up these files is vital to any SEO professional. If a robots.txt file is incorrectly configured, it can cause multiple indexing errors. So every time you start a new SEO campaign, check your robots.txt file with Google's robots.txt testing tool.


Using Robots to “Hide” Your Content

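Robots.txt files can be used to keep certain directories out of the SERPs of all search engines. For this, the “Disallow” directive is used. Keep in mind that Disallow stops crawling, so a blocked URL can still show up in results if other sites link to it.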

Here are some pages you should hide using a robots.txt file:

Pages with duplicate content
Pagination pages
Thank-you pages
Shopping cart pages
Administration pages
Pages with account information
Dynamic product and service pages (which vary greatly)

However, do not forget that any robots.txt file is publicly available on the internet. To access a robots.txt file, simply add /robots.txt to the site's domain in your browser's address bar.
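Since the file always sits at the root of the site, you can work out its address from any page URL. Here is a minimal Python sketch of that idea. The function name `robots_txt_url` and the example.com address are my own placeholders, not part of any standard tool.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the robots.txt address for the site that serves page_url."""
    parts = urlsplit(page_url)
    # Keep only scheme and host (plus port); path and query are dropped
    # because robots.txt lives at the root of the site.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_txt_url("https://www.example.com/blog/post?id=7"))
# https://www.example.com/robots.txt
```

Note that each subdomain has its own robots.txt, so shop.example.com and www.example.com are checked separately.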

How to use Robots.txt

Robots.txt files allow a wide choice of settings. Their main benefit, however, is that they allow SEO specialists to “allow” or “disallow” multiple pages at once without having to access the code for each page one by one.

For example, you can block all search crawlers with these directives:

User-agent: *
Disallow: /

Or hide part of your site's structure, such as specific categories:

User-agent: *
Disallow: /no-index/
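If you want to check rules like these before publishing them, Python's standard library ships a small parser, `urllib.robotparser`. The sketch below feeds it the two rule sets above and asks whether a crawler may fetch a few URLs. The example.com URLs and the helper name `parse_rules` are placeholders of mine, and this parser is simpler than Google's (no wildcard support, for instance), so treat it as a quick sanity check rather than the final word.

```python
from urllib.robotparser import RobotFileParser


def parse_rules(robots_txt: str) -> RobotFileParser:
    """Load robots.txt text into the standard-library parser."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


block_all = parse_rules("User-agent: *\nDisallow: /")
block_folder = parse_rules("User-agent: *\nDisallow: /no-index/")

# "Disallow: /" matches every path on the site.
print(block_all.can_fetch("Googlebot", "https://example.com/any-page.html"))  # False

# "Disallow: /no-index/" only matches paths inside that folder.
print(block_folder.can_fetch("Googlebot", "https://example.com/no-index/page.html"))  # False
print(block_folder.can_fetch("Googlebot", "https://example.com/blog/"))  # True
```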

We can also exclude several pages from search at once. Just hide these pages from crawlers by adding one “Disallow” line per page or folder.
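When the list of pages gets long, it can be easier to generate the block than to type it. Here is a minimal sketch in Python. The function `build_robots_txt` and the paths are hypothetical examples based on the page types listed earlier, so swap in the real paths of your site.

```python
def build_robots_txt(paths, user_agent="*"):
    """Build one robots.txt group with a Disallow line for each path."""
    lines = [f"User-agent: {user_agent}"]
    lines += [f"Disallow: {path}" for path in paths]
    return "\n".join(lines) + "\n"


# Hypothetical paths for cart, thank-you, admin and account pages.
hidden_paths = ["/cart/", "/thank-you/", "/admin/", "/account/"]
print(build_robots_txt(hidden_paths), end="")
```

Each path must start with a slash and contain no stray spaces, otherwise crawlers will not match it against your URLs.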

One of the best things about working with robots.txt is that it allows you to prioritize certain pages, categories and even pieces of CSS and JS code. Take a look at the example below:

In the example, we disallow WordPress pages and specific categories, but wp-content files, JS plugins, CSS styles and blog styles are allowed. This approach ensures that spiders crawl and index useful code and categories.
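A rule set of the kind described above could look like the following sketch. The paths are assumptions of mine based on a default WordPress layout, and `/category/internal/` is a made-up category, so this is an illustration and not the exact file from the example. The rules are checked with Python's `urllib.robotparser`.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical WordPress-style rules: admin and core folders plus one
# category are blocked, while uploads, plugins, themes and the blog
# stay open to crawlers.
ROBOTS_TXT = """\
User-agent: *
Allow: /wp-content/uploads/
Allow: /wp-content/plugins/
Allow: /wp-content/themes/
Allow: /blog/
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /category/internal/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def crawlable(path: str) -> bool:
    """True if the rules let any crawler fetch this path."""
    return parser.can_fetch("*", "https://example.com" + path)


for path in [
    "/wp-admin/options.php",
    "/wp-content/themes/site/style.css",
    "/wp-content/plugins/slider/slider.js",
    "/category/internal/",
    "/blog/hello-world/",
]:
    print(path, crawlable(path))
```

Keeping CSS and JS folders open matters because search engines render pages, and they need those files to see the page the way visitors do.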

Top Commands for Robots.txt

Before finishing this post, I will list the main commands and functions so that you can configure your robots.txt in any text editor:

To allow crawlers to access all content:
User-agent: *
Disallow:

To block all content:
User-agent: *
Disallow: /

To block a specific folder:
User-agent: *
Disallow: /folder/

To block a folder for Googlebot but still allow one file from that folder:
User-agent: Googlebot
Disallow: /folder1/
Allow: /folder1/my-page.html
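The last case works because, when an Allow and a Disallow rule both match a URL, Google follows the most specific one, meaning the rule with the longest path. Here is a simplified Python sketch of that precedence. The function `is_allowed` is my own illustration, it ignores wildcards, and it is not Google's actual code.

```python
def is_allowed(rules, path):
    """Apply longest-match precedence to one user-agent group.

    rules is a list of ("allow" | "disallow", path_prefix) pairs.
    """
    best_len = -1
    allowed = True  # with no matching rule, the path may be crawled
    for directive, prefix in rules:
        if prefix and path.startswith(prefix):
            is_allow = directive == "allow"
            # The longer prefix wins; on a tie, allow wins.
            if len(prefix) > best_len or (len(prefix) == best_len and is_allow):
                best_len = len(prefix)
                allowed = is_allow
    return allowed


googlebot_rules = [
    ("disallow", "/folder1/"),
    ("allow", "/folder1/my-page.html"),
]

print(is_allowed(googlebot_rules, "/folder1/my-page.html"))  # True
print(is_allowed(googlebot_rules, "/folder1/other.html"))    # False
print(is_allowed(googlebot_rules, "/folder2/index.html"))    # True
```

Not every crawler or library resolves conflicts this way. Python's `urllib.robotparser`, for example, applies the first rule that matches, so with that parser the Allow line would need to come before the Disallow line.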


Mastering robots.txt can be a very important factor in the success or failure of your SEO strategy. Besides the text file, we can also control what spiders are allowed to do directly in our pages with robots meta tags. But that is a topic for the next post!
