Robots.txt Generator
Generate a clean robots.txt file for launch surfaces, docs hubs, and product shells without hand-formatting directive blocks.
Usage notes
Robots.txt controls crawler access. It does not guarantee deindexing for URLs that search engines already know about.
Rule groups
Add one directive block per crawler or crawler family.
Directive blocks
Block 1
One path per line.
Sitemaps
List the sitemap files you want crawlers to discover.
One sitemap URL per line.
Trust
How this tool handles the task
How it runs
The generator builds the file from the form values in this page. It does not fetch anything from a live site.
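As a rough illustration of that behavior, the sketch below assembles robots.txt text purely from in-memory values, with no network access. The `RuleGroup` type and `build_robots_txt` function are hypothetical names chosen for this example, not the tool's actual code, and Python is used only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class RuleGroup:
    """One directive block: a crawler name plus its allow and disallow paths."""
    user_agent: str = "*"
    allow: list[str] = field(default_factory=list)
    disallow: list[str] = field(default_factory=list)


def build_robots_txt(groups: list[RuleGroup], sitemaps: list[str]) -> str:
    """Assemble robots.txt text from the given values only."""
    lines: list[str] = []
    for group in groups:
        lines.append(f"User-agent: {group.user_agent}")
        lines.extend(f"Allow: {path}" for path in group.allow)
        lines.extend(f"Disallow: {path}" for path in group.disallow)
        lines.append("")  # a blank line separates directive blocks
    lines.extend(f"Sitemap: {url}" for url in sitemaps)
    # Trim any trailing blank line and end the file with a single newline.
    return "\n".join(lines).strip() + "\n"


if __name__ == "__main__":
    print(
        build_robots_txt(
            [RuleGroup("*", allow=["/docs/"], disallow=["/admin/"])],
            ["https://example.com/sitemap.xml"],
        )
    )
```

Nothing appears in the result unless it was passed in, which mirrors the limit noted below: only the blocks and sitemap lines you supply end up in the output.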
Current limits
Only the blocks and sitemap lines in this form appear in the output. Review the final text before publishing it.
Privacy
Draft rules stay in the browser unless you choose to copy or download the result.
Examples
How to use this tool
- Add one crawler group per directive set you want to publish.
- List allow, disallow, and sitemap lines exactly as they should appear in the file.
- Copy the generated output into your deployed `robots.txt` route, then validate the live result.
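
One way to sanity-check a generated file before or after deploying it is Python's standard `urllib.robotparser` module. The sketch below parses a local string rather than a live site; the rules, the `ExampleBot` name, and the `example.com` URLs are placeholders. Python's parser applies rules in file order, which may not match how every search engine resolves overlapping allow and disallow lines, so treat the result as a rough check.

```python
from urllib.robotparser import RobotFileParser

# Placeholder output, standing in for text copied from the generator.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/

Sitemap: https://example.com/sitemap.xml
"""


def parse_rules(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text without fetching anything."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


if __name__ == "__main__":
    rules = parse_rules(ROBOTS_TXT)
    for url in ("https://example.com/admin/login", "https://example.com/docs/"):
        verdict = "allowed" if rules.can_fetch("ExampleBot", url) else "blocked"
        print(f"{url}: {verdict}")
```

To check the deployed file instead, `RobotFileParser` can load it with `set_url()` followed by `read()`.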
Common mistakes and limits
- Blocking a URL in `robots.txt` does not remove it from search results by itself.
- Keep paths host-relative, such as `/private/`. Do not paste full URLs into allow or disallow directives; only sitemap lines take full URLs.
- Use page-level `noindex` when you need indexing control instead of crawler blocking.
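
The host-relative rule above is easy to check mechanically before publishing. This is a minimal sketch with hypothetical helper names (`check_directive_path`, `find_path_problems`); it only flags full URLs and paths that do not start with `/`, and it does not validate wildcard syntax.

```python
from __future__ import annotations

from typing import Optional


def check_directive_path(path: str) -> Optional[str]:
    """Return a short problem description, or None if the path looks host-relative."""
    value = path.strip()
    if not value:
        return None  # empty values are left for the author to review
    if "://" in value:
        return "looks like a full URL; keep only the path"
    if not value.startswith("/"):
        return "should start with '/'"
    return None


def find_path_problems(paths: list[str]) -> dict[str, str]:
    """Map each problematic path to the reason it was flagged."""
    problems: dict[str, str] = {}
    for path in paths:
        problem = check_directive_path(path)
        if problem is not None:
            problems[path] = problem
    return problems


if __name__ == "__main__":
    drafts = ["/private/", "https://example.com/private/", "tmp/"]
    for path, reason in find_path_problems(drafts).items():
        print(f"{path}: {reason}")
```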
FAQ
When should I use robots.txt instead of a noindex tag?
Use robots.txt to guide crawler access to paths or directories. Use a page-level noindex when the page can be crawled but should stay out of search results.
Does blocking a URL in robots.txt remove it from Google?
No. A blocked URL can still appear in search if other pages link to it. Deindexing usually needs a crawlable page-level noindex or a removal workflow.
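
Because a page-level noindex only works when the page can be crawled, it can help to confirm that pages you plan to noindex are not also blocked. The sketch below uses Python's standard `urllib.robotparser`; the function name `blocked_noindex_urls` and the sample URLs are assumptions made for this example.

```python
from __future__ import annotations

from urllib.robotparser import RobotFileParser


def blocked_noindex_urls(
    robots_txt: str, noindex_urls: list[str], user_agent: str = "*"
) -> list[str]:
    """Return the URLs that robots.txt blocks, where a crawler could not read the noindex."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in noindex_urls if not parser.can_fetch(user_agent, url)]


if __name__ == "__main__":
    robots_txt = "User-agent: *\nDisallow: /drafts/\n"
    planned = ["https://example.com/drafts/old", "https://example.com/thanks"]
    for url in blocked_noindex_urls(robots_txt, planned):
        print(f"Blocked, so a noindex on this page would not be seen: {url}")
```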
Why create separate user-agent blocks?
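Separate blocks help when a crawler needs different allow, disallow, or crawl-delay directives from your wildcard rules. A crawler that matches a named block follows that block instead of the wildcard one, so repeat any shared rules inside it.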
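
To see how a named block changes the outcome, the sketch below parses a two-block file with Python's standard `urllib.robotparser` and compares what two crawlers may fetch. `ExampleBot` and `OtherBot` are made-up crawler names, and the rules are illustrative only. Not every crawler honors `Crawl-delay`, so the value shown reflects what the file says rather than what a crawler will do.

```python
from urllib.robotparser import RobotFileParser

# Illustrative file: a wildcard block plus a stricter block for one crawler.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/

User-agent: ExampleBot
Disallow: /admin/
Disallow: /search/
Crawl-delay: 10
"""


def load(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text held in memory."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


if __name__ == "__main__":
    rules = load(ROBOTS_TXT)
    url = "https://example.com/search/results"
    for agent in ("ExampleBot", "OtherBot"):
        allowed = rules.can_fetch(agent, url)
        delay = rules.crawl_delay(agent)
        print(f"{agent}: allowed={allowed}, crawl_delay={delay}")
```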