Robots.txt is a text file webmasters create
to instruct search engine robots how to crawl and index pages on their website.
The file needs to be placed in the top-level directory of a web server in order to be useful.
The Robots.txt file is what tells the search engines which pages on your
website to access and index and which pages not to. For example, if you specify
in your Robots.txt file that you don’t want the search engines to be able to
access your thank you page, compliant search engines will not crawl that page,
so it is unlikely to show up in the search results where web users could find
it. Keeping the search engines from accessing certain pages on your site is
important both for the privacy of your site and for your SEO. This article will
explain why this is and provide you with the knowledge of how to set up a good
Robots.txt file.
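
The placement rule mentioned above (the file must sit in the top-level directory) means a robot looks for it at one fixed address per site. A minimal Python sketch of that lookup, assuming a hypothetical helper name robots_txt_url and the placeholder domain www.example.com:

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the top-level robots.txt address for the site hosting page_url.

    Hypothetical helper for illustration: it keeps the scheme and host of
    the page and replaces path, query, and fragment with "/robots.txt".
    """
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# Any page on the site maps to the same robots.txt location.
print(robots_txt_url("https://www.example.com/blog/post.html?page=2"))
# prints "https://www.example.com/robots.txt"
```

A file saved anywhere else, for instance inside a subdirectory, would not be found at this address.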
Robots.txt files are useful:
If you want search engines to ignore any duplicate pages on your website
If you don’t want search engines to index your internal search results
If you don’t want search engines to index certain areas of your website or
the whole website
If you don’t want search engines to index certain files on your website
(images, PDFs, etc.)
If you want to tell search engines where your sitemap is located
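
The uses listed above can be combined in a single file. The sketch below is only an illustration: the domain www.example.com and every path in it (/print/, /search, /private/, /downloads/brochure.pdf) are invented placeholders. It feeds the file to the robots.txt parser in Python's standard library, which stands in for a compliant robot; real search engines may interpret some rules differently.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt; every path and the domain are placeholders.
ROBOTS_TXT = """\
User-agent: *
Disallow: /print/
Disallow: /search
Disallow: /private/
Disallow: /downloads/brochure.pdf

Sitemap: https://www.example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(path: str) -> bool:
    """True if a robot that obeys the "*" record may fetch the path."""
    return parser.can_fetch("*", "https://www.example.com" + path)


print(allowed("/print/article-1.html"))    # duplicate print view: False
print(allowed("/search?q=shoes"))          # internal search results: False
print(allowed("/private/report.html"))     # closed-off area: False
print(allowed("/downloads/brochure.pdf"))  # single file: False
print(allowed("/products/index.html"))     # not listed: True
print(parser.site_maps())                  # ['https://www.example.com/sitemap.xml']
```

RobotFileParser.site_maps() requires Python 3.8 or later.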
As mentioned above, the robots.txt file is a simple text file. Open a simple text
editor to create it. The content of a robots.txt file consists of so-called
"records". A record contains the information for a specific search engine. Each
record consists of two fields: the user agent line and one or more Disallow
lines. Here's an example: a robots.txt file containing the lines
"User-agent: googlebot" and "Disallow: /cgi-bin/" would allow the "googlebot",
which is the search engine spider of Google, to retrieve every page from your
site except for files from the "cgi-bin" directory. All files in the
"cgi-bin" directory will be ignored by googlebot.
The Disallow command works like a wildcard: it matches every path that begins
with the text you enter. If you enter, for example, "User-agent: *" followed by
"Disallow: /support", both "/support-desk/index.html" and
"/support/index.html" as well as all other files in the "support" directory
would not be crawled by search engines.
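
The two behaviours described above, a record aimed at one robot and Disallow matching by prefix, can be checked with a short Python sketch. It uses the standard library's robots.txt parser as a stand-in for a compliant robot. The helper name parse_robots, the domain www.example.com, the file names, and the "bingbot" comparison are assumptions added for illustration.

```python
from urllib.robotparser import RobotFileParser

SITE = "https://www.example.com"  # placeholder domain


def parse_robots(text: str) -> RobotFileParser:
    """Hypothetical helper: parse robots.txt content given as a string."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


# One record: a user agent line plus one Disallow line.
googlebot_only = parse_robots("""\
User-agent: googlebot
Disallow: /cgi-bin/
""")
print(googlebot_only.can_fetch("googlebot", SITE + "/cgi-bin/form.cgi"))  # False
print(googlebot_only.can_fetch("googlebot", SITE + "/index.html"))        # True
# The record names only googlebot, so another robot is not restricted by it.
print(googlebot_only.can_fetch("bingbot", SITE + "/cgi-bin/form.cgi"))    # True

# Prefix matching: "/support" also covers "/support-desk/".
prefix = parse_robots("""\
User-agent: *
Disallow: /support
""")
print(prefix.can_fetch("*", SITE + "/support/index.html"))       # False
print(prefix.can_fetch("*", SITE + "/support-desk/index.html"))  # False

# A trailing slash limits the rule to the "support" directory itself.
directory_only = parse_robots("""\
User-agent: *
Disallow: /support/
""")
print(directory_only.can_fetch("*", SITE + "/support/index.html"))       # False
print(directory_only.can_fetch("*", SITE + "/support-desk/index.html"))  # True
```

Ending the path with a slash is therefore the usual way to block one directory without also catching similarly named ones.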
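It is important to update your Robots.txt file if you add pages, files or
directories to your site that you don’t wish to be crawled by the search
engines. Keep in mind that the file is publicly readable and only asks robots
to stay away; it does not stop web users from opening a page. Keeping it
current protects the privacy of those areas in the search results and gives the
best possible results with your search engine optimization.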
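
One way to make this update routine is to keep a list of the paths you want hidden and check it against the file whenever the site changes. The following Python sketch assumes a hypothetical function name unblocked_paths and invented example paths; it relies on the standard library's parser, so the result reflects how that parser reads the rules rather than any particular search engine.

```python
from urllib.robotparser import RobotFileParser


def unblocked_paths(robots_txt, private_paths, user_agent="*"):
    """Return the paths from private_paths that the rules still allow.

    Hypothetical audit helper: an empty result means every listed path
    is covered by a Disallow line for the given user agent.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [path for path in private_paths if parser.can_fetch(user_agent, path)]


# Invented example: the new "/drafts/" directory was added to the site
# but not yet to the file.
current_rules = """\
User-agent: *
Disallow: /thank-you.html
Disallow: /internal/
"""
should_be_hidden = ["/thank-you.html", "/internal/notes.html", "/drafts/new-page.html"]

print(unblocked_paths(current_rules, should_be_hidden))
# prints ['/drafts/new-page.html']
```

Any path the function returns still needs a Disallow line.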