Robots.txt
A file that tells search engine crawlers which pages on your site they can and can't access.
Robots.txt is a text file in your website's root directory (yoursite.com/robots.txt) that tells search engine crawlers which URLs they can access. It's like a set of instructions posted at the front door telling visitors which rooms they can enter.
The file uses simple directives: User-agent specifies which bot the rules apply to, Disallow blocks specific URLs or directories, Allow overrides a Disallow for specific paths, and Sitemap points to your XML sitemap file.
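A minimal robots.txt combining these directives might look like this (the paths are illustrative, not recommendations for your site):

```
# Apply these rules to all crawlers
User-agent: *
# Block the admin directory...
Disallow: /admin/
# ...but allow one public path inside it
Allow: /admin/public/
# Point crawlers to the XML sitemap
Sitemap: https://yoursite.com/sitemap.xml
```

Rules are grouped under each User-agent line, so you can give different instructions to different bots (for example, a stricter group for a specific crawler).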
Important: robots.txt blocks crawling, not indexing. If other sites link to a page you've blocked in robots.txt, Google may still index the URL (without its content) based on those external signals. To prevent indexing, use the noindex meta tag instead, and make sure the page isn't also blocked in robots.txt: crawlers can only see the noindex tag on pages they're allowed to crawl.
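The noindex directive is a standard meta tag placed in the page's HTML head, for example:

```html
<!-- In the page's <head>: allow crawling, but keep the page out of the index -->
<meta name="robots" content="noindex">
```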
Common robots.txt uses: blocking admin areas, search results pages, staging environments, and duplicate content paths. Don't block CSS or JavaScript files — Google needs them to render your pages properly.
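You can sanity-check your rules programmatically before deploying them. This sketch uses Python's built-in `urllib.robotparser` with hypothetical rules and URLs; note that Python's parser applies rules in file order, so the more specific Allow line comes first here:

```python
from urllib import robotparser

# Hypothetical robots.txt rules for illustration
rules = [
    "User-agent: *",
    "Allow: /admin/public/",   # more specific rule listed first
    "Disallow: /admin/",
]

rp = robotparser.RobotFileParser()
rp.parse(rules)

# Check whether a generic crawler ("*") may fetch each URL
print(rp.can_fetch("*", "https://example.com/admin/settings"))    # False
print(rp.can_fetch("*", "https://example.com/blog/post"))         # True
print(rp.can_fetch("*", "https://example.com/admin/public/faq"))  # True
```

For a live site, `rp.set_url("https://yoursite.com/robots.txt")` followed by `rp.read()` fetches and parses the real file instead.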
Why It Matters for SEO
A well-configured robots.txt file helps search engines crawl your site efficiently, focusing on your important content. A misconfigured one can accidentally block your entire site from being indexed — one of the most devastating technical SEO mistakes.
🔍 How to Check This
Use AuditMySite's Robots.txt Generator to create an optimized robots.txt file for your site.
Try Robots.txt Generator →