Technically, robots.txt is a simple text file placed in the root directory of a website (e.g., www.example.com/robots.txt). It adheres to the Robots Exclusion Standard, a set of guidelines for how web crawlers should behave when visiting a website. The file contains instructions in the form of "directives" that tell bots which parts of the website they can and cannot crawl.
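For illustration, a minimal robots.txt might look like the sketch below. The paths and the www.example.com domain are placeholders, and the Disallow/Allow rules are grouped under a User-agent line naming the crawler they apply to (here `*`, meaning all bots). Each directive is described in the table that follows.

```
User-agent: *
Disallow: /admin/
Allow: /public/
Crawl-delay: 10
Sitemap: https://www.example.com/sitemap.xml
```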
| Directive | Description | Example |
|---|---|---|
| Disallow | Specifies paths or patterns that the bot should not crawl. | Disallow: /admin/ (disallow access to the admin directory) |
| Allow | Explicitly permits the bot to crawl specific paths or patterns, even if they fall under a broader Disallow rule. | Allow: /public/ (allow access to the public directory) |
| Crawl-delay | Sets a delay (in seconds) between successive requests from the bot to avoid overloading the server. | Crawl-delay: 10 (10-second delay between requests) |
| Sitemap | Provides the URL of an XML sitemap for more efficient crawling. | Sitemap: https://www.example.com/sitemap.xml |
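To see how these directives are interpreted in practice, here is a short sketch using Python's standard `urllib.robotparser` module to check URLs against a site's robots.txt. The www.example.com URLs and paths are the same illustrative placeholders used above, not a real site.

```python
import urllib.robotparser

# Load and parse the site's robots.txt (illustrative URL).
rp = urllib.robotparser.RobotFileParser()
rp.set_url("https://www.example.com/robots.txt")
rp.read()

# Ask whether a generic crawler ("*") may fetch specific paths.
print(rp.can_fetch("*", "https://www.example.com/admin/page.html"))   # False if /admin/ is disallowed
print(rp.can_fetch("*", "https://www.example.com/public/page.html"))  # True if /public/ is allowed

# Crawl-delay and Sitemap values, if declared (Python 3.6+ / 3.8+).
print(rp.crawl_delay("*"))
print(rp.site_maps())
```

Well-behaved crawlers perform a check like this before requesting each URL.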